INDEX
Explanations
information related to historical events and social issues
Neuron Alignment
Index   Value   % of L₁
453     +0.16   0.5%
776     +0.12   0.4%
1978    +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
321     +0.16      0.08
776     +0.12      0.06
573     +0.11      0.05
Negative Logits
<bos>          -1.60
unspeak        -1.34
gratify        -1.13
philosophic    -1.13
ardour         -1.10
endeavouring   -1.09
vainly         -1.08
luxuriant      -1.08
tolerably      -1.07
sophistic      -1.07
Positive Logits
alkoh          1.92
silikon        1.87
utop           1.81
mikrofon       1.75
spion          1.74
makro          1.73
kosme          1.71
keramik        1.65
kask           1.64
elek           1.64
Activations Density: 0.161%