INDEX
Explanations
references to containing or holding items or substances
Neuron Alignment
Index   Value   % of L₁
32      +0.14   0.5%
1810    +0.12   0.4%
369     +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
369     +0.14      0.02
1096    +0.12      0.02
1810    +0.12      0.02
Negative Logits
Argumento    -0.51
buquerque    -0.48
Dallas       -0.45
Mumbai       -0.45
Figure       -0.45
Friday       -0.44
phi          -0.44
Big          -0.43
Friday       -0.43
Dallas       -0.43
Positive Logits
CONTAIN      1.08
Contain      1.02
contain      1.01
Contains     0.92
contain      0.92
poff         0.91
Containing   0.91
ftu          0.90
contains     0.90
aen          0.88
Activations Density: 0.075%