Explanations
mentions of grid-related concepts or elements
Neuron Alignment

| Index | Value | % of L₁ |
|-------|-------|---------|
| 255   | +0.11 | 0.7%    |
| 392   | +0.11 | 0.6%    |
| 301   | +0.10 | 0.6%    |
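The "% of L₁" column is not defined on the page. One plausible reading, offered only as an assumption, is that it reports each neuron's absolute weight as a share of the L₁ norm of the feature's full weight vector. The sketch below illustrates that reading on a made-up toy vector; the function name `l1_share` and the input values are hypothetical and are not taken from the feature.

```python
def l1_share(weights):
    """Return each weight's absolute value as a fraction of the vector's L1 norm."""
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        # Avoid dividing by zero for an all-zero vector.
        return [0.0 for _ in weights]
    return [abs(w) / l1 for w in weights]


# Hypothetical toy vector, NOT the feature's real weights.
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
print([f"{s:.1%}" for s in shares])
```

Under this reading, shares below 1% for the top three neurons would suggest that the feature's weight mass is spread over many neurons rather than concentrated in a few.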
Correlated Neurons

| Index | P. Corr. | Cos Sim. |
|-------|----------|----------|
| 255   | +0.11    | 0.01     |
| 301   | +0.11    | 0.01     |
| 392   | +0.10    | 0.01     |
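Assuming "P. Corr." stands for Pearson correlation and "Cos Sim." for cosine similarity, the two columns need not agree: Pearson correlation is cosine similarity computed after subtracting each vector's mean. The page does not say which vectors each column is computed over, so the sketch below only shows the general relationship on made-up data; the helper names and inputs are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between x and y (both must be non-zero vectors)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical toy data: all-positive values that move in opposite directions.
x = [1.0, 2.0, 3.0, 4.0]
y = [4.0, 3.0, 2.0, 1.0]
print(round(cosine_similarity(x, y), 3))
print(round(pearson_correlation(x, y), 3))
```

In this toy case the cosine similarity is positive because both vectors have positive entries, while the Pearson correlation is strongly negative because the values trend in opposite directions.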
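Negative Logits

| Token      | Logit |
|------------|-------|
| uses       | -1.89 |
| Copyright  | -1.79 |
| suppose    | -1.78 |
| supposed   | -1.69 |
| somebody   | -1.65 |
| indeed     | -1.62 |
| shortest   | -1.60 |
| pronounced | -1.60 |
| plaintiff  | -1.58 |
| chosen     | -1.58 |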
Positive Logits

| Token | Logit |
|-------|-------|
| aggio | 2.01  |
| bank  | 1.99  |
| emos  | 1.94  |
| antry | 1.88  |
| -âĤ¬  | 1.87  |
| âĢIJ   | 1.85  |
| ball  | 1.83  |
| notes | 1.83  |
| ĻĤ    | 1.82  |
| IFIC  | 1.81  |
Activations Density: 0.186%