INDEX
Explanations
mentions of media and related terminology
Neuron Alignment
Index   Value   % of L₁
156     +0.16   0.9%
451     +0.12   0.7%
371     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
371     +0.16      0.03
451     +0.12      0.02
38      +0.12      0.01
Negative Logits
IJ      -2.32
ľĵ      -2.20
§       -2.16
Į       -2.14
ij      -1.97
ĩ       -1.96
¤       -1.89
Ķ       -1.82
ĥ½      -1.81
ķ       -1.79
Positive Logits
eval        2.22
wiki        2.16
film        2.08
ieux        1.89
ural        1.87
films       1.87
works       1.81
outlets     1.70
processor   1.69
ulator      1.66
Activations Density: 0.092%