INDEX
Explanations
mentions of a specific person named Phil
Neuron Alignment
Index   Value   % of L₁
687     +0.17   0.7%
241     +0.15   0.7%
1763    +0.14   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
168     +0.17      0.03
687     +0.15      0.03
629     +0.14      0.02
Negative Logits
Datuak       -0.47
antana       -0.44
viders       -0.43
asantry      -0.43
onews        -0.43
latimes      -0.42
trise        -0.41
borderTop    -0.41
وردار        -0.41
concussion   -0.41
Positive Logits
PHIL      1.34
Phil      1.29
phil      1.22
Phil      1.20
PHILIP    1.17
Philip    1.15
phi       1.11
Phillip   1.10
Phi       1.08
Philip    1.06
Activations Density 0.102%