INDEX
Explanations
mentions of future events or plans
Neuron Alignment
Index    Value    % of L₁
478      +0.12    0.4%
1967     +0.11    0.4%
663      +0.11    0.4%
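
The dashboard does not define the "% of L₁" column. One plausible reading is that it gives each neuron's share of the L₁ norm of the feature's direction vector. The following is a minimal sketch under that assumption; the function name `neuron_alignment` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def neuron_alignment(direction, k=3):
    """Return the k largest-magnitude entries of `direction`.

    Each row is (index, value, share_of_l1), where
    share_of_l1 = |value| / sum(|v| for v in direction).
    This reading of "% of L1" is an assumption, not the dashboard's
    documented definition.
    """
    l1 = sum(abs(v) for v in direction)
    if l1 == 0:
        return []
    # Rank neuron indices by absolute value, largest first.
    ranked = sorted(range(len(direction)),
                    key=lambda i: abs(direction[i]),
                    reverse=True)
    return [(i, direction[i], abs(direction[i]) / l1) for i in ranked[:k]]


# Hypothetical toy vector, chosen only to make the arithmetic easy to check.
toy = [0.5, -0.25, 0.0, 0.25]
rows = neuron_alignment(toy, k=2)
for index, value, share in rows:
    print(f"{index:<8}{value:+.2f}    {share:.1%}")
```

Under this reading, a value of +0.12 listed at 0.4% would correspond to an L₁ norm of roughly 30 for the full direction vector.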
Correlated Neurons
Index    Pearson Corr.    Cosine Sim.
663      +0.12            0.04
122      +0.11            0.04
47       +0.11            0.05
Negative Logits
Wtf            -0.54
Whence         -0.53
Miscell        -0.51
ducato         -0.50
unspeak        -0.50
forskj         -0.50
érard          -0.49
betreffende    -0.49
triomphe       -0.48
veiligheid     -0.47
Positive Logits
ideolog        0.76
bej            0.71
intende        0.70
diabe          0.69
notor          0.68
republi        0.68
atify          0.67
solidar        0.65
ambul          0.63
gambe          0.63
Activations Density 0.201%