Explanations
references to significant events or entities, possibly related to entertainment or politics
Neuron Alignment
Index   Value   % of L₁
1350    +0.14   0.5%
1757    +0.13   0.5%
481     +0.13   0.5%
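The page does not define the "% of L₁" column. One plausible reading, assumed here, is each neuron's share of the L₁ norm of the feature's weight vector; under that reading a value of +0.14 at 0.5% would imply an L₁ norm of roughly 28. A minimal Python sketch of that calculation follows, where `top_aligned` is a hypothetical helper and the weights are toy numbers, not the real feature:

```python
def top_aligned(weights, k=3):
    """Return (index, value, share of L1 norm) for the k largest-magnitude entries."""
    # L1 norm: sum of absolute values over the whole vector.
    l1 = sum(abs(w) for w in weights)
    # Rank neuron indices by absolute weight, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in ranked[:k]]


# Toy vector for illustration only (assumption, not data from the page).
toy = [0.5, -0.25, 0.125, 0.125]
rows = top_aligned(toy, k=2)
```

Each returned tuple mirrors one row of the table: neuron index, signed value, and fractional share (multiply by 100 for a percentage).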
Correlated Neurons
Index   Pearson Corr.   Cosine Sim.
405     +0.14           0.05
481     +0.13           0.05
1757    +0.13           0.05
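The two columns above can differ because Pearson correlation centers each series on its mean before comparing, while cosine similarity does not. The sketch below assumes both are computed between the feature's activations and a neuron's activations over the same tokens; the page does not confirm this, and the function names and toy inputs are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length sequences."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(u, v):
    """Pearson correlation: cosine similarity of the mean-centered sequences."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])


# Toy activations (assumption): v is u shifted by a constant offset.
u = [1.0, 2.0, 3.0]
v = [4.0, 5.0, 6.0]
p = pearson(u, v)
c = cosine(u, v)
```

In this toy case the constant offset leaves the Pearson correlation at 1 but pulls the cosine similarity slightly below 1.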
Negative Logits
Token        Logit
textnormal   -0.62
triangleq    -0.61
indescri     -0.53
encomp       -0.53
apprehen     -0.51
Chalet       -0.51
affor        -0.50
bigsqcup     -0.50
purcha       -0.50
ftw          -0.50
Positive Logits
Token    Logit
big      1.12
big      1.11
BIG      1.06
BIG      1.05
Big      1.03
Big      1.02
bigger   0.79
Biggs    0.76
bigger   0.66
大的     0.61
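Logit lists like these are commonly obtained by projecting a feature's output direction through the model's unembedding matrix and sorting tokens by the resulting score. Whether this page does exactly that is an assumption. The sketch below uses a hypothetical `logit_effects` helper with a made-up two-dimensional direction and three-token vocabulary:

```python
def logit_effects(direction, unembed, vocab):
    """Score each token by the dot product of the direction with its unembedding row."""
    scores = {
        tok: sum(d * w for d, w in zip(direction, row))
        for tok, row in zip(vocab, unembed)
    }
    # Highest scores first: the head is the "positive" list, the tail the "negative" one.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data for illustration only (assumption, not taken from the page).
vocab = ["big", "small", "the"]
unembed = [[1.0, 0.0], [-0.5, 0.5], [0.0, 1.0]]
direction = [1.0, 0.0]
ranked = logit_effects(direction, unembed, vocab)
```

Reading the first few entries of `ranked` gives the most promoted tokens and the last few give the most suppressed ones.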
Activation Density: 0.103%
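Activation density is usually read as the fraction of sampled tokens on which the feature fires at all, so 0.103% would correspond to roughly one token in 970. That definition, the zero threshold, and the helper name below are assumptions; the sketch uses toy activations:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above the threshold."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy sample (assumption): the feature fires on 1 token out of 1000.
toy_acts = [0.0] * 999 + [2.5]
density = activation_density(toy_acts)
```

Multiplying the result by 100 gives the percentage form shown on the page.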