Explanations
terms related to political controversies and mainstream media narratives
Neuron Alignment
Index   Value   % of L₁
478     +0.16   0.7%
597     +0.14   0.6%
1810    +0.13   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
597     +0.16      0.04
478     +0.14      0.02
1870    +0.13      0.02
Negative Logits
<bos>           -1.36
intersper       -0.81
rile            -0.73
Tja             -0.71
shenan          -0.70
philanth        -0.70
verwijspagina   -0.69
underval        -0.69
allude          -0.67
catast          -0.65
Positive Logits
mainstream      0.90
alpes           0.81
grises          0.75
fides           0.72
fons            0.69
corrom          0.69
groupName       0.67
ErrorCode       0.66
tristes         0.64
omenclature     0.64
Activations Density 0.550%