INDEX
Explanations
terms related to conspiracy theories and political events
Neuron Alignment
Index    Value    % of L₁
1967     +0.11    0.3%
513      +0.09    0.3%
612      +0.09    0.3%
Correlated Neurons
Index    Pearson Corr.    Cosine Sim.
74       +0.11            0.03
678      +0.09            0.04
1363     +0.09            0.04
Negative Logits
Token           Logit
unceasing       -0.79
disregarding    -0.78
wilfully        -0.77
endeavouring    -0.76
unavoid         -0.73
ineffectual     -0.72
roused          -0.71
thoughtless     -0.71
mercurial       -0.70
languid         -0.70
Positive Logits
Token      Logit
affez      1.33
meras      1.27
cyr        1.23
fré        1.22
utop       1.22
dises      1.22
alkoh      1.21
lapto      1.21
ideolog    1.19
kosme      1.18
Activations Density: 0.189%