INDEX
Explanations
references to various societies or organizations
Neuron Alignment
Index   Value   % of L₁
156     +0.17   1.0%
101     +0.10   0.6%
245     +0.09   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
101     +0.17      0.02
273     +0.10      0.02
245     +0.09      0.02
Negative Logits
thee       -1.92
yours      -1.87
anything   -1.82
anyone     -1.67
hers       -1.67
anybody    -1.63
TY         -1.61
someone    -1.61
any        -1.58
them       -1.57
Positive Logits
Ľ   2.41
°   2.37
ŀ   2.27
ī   2.23
Ļ   2.19
ģ   2.17
ĸ   2.16
Ŀ   2.13
Ĵ   2.06
¡   2.04
Activations Density 0.010%