INDEX
Explanations
information related to political events and official statements
Neuron Alignment
Index    Value    % of L₁
1445     +0.17    0.5%
752      +0.14    0.4%
198      +0.13    0.4%
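The dashboard does not say how the "% of L₁" column is derived. One plausible reading, offered here only as an assumption, is that each entry is the absolute value of a neuron's weight divided by the L₁ norm (sum of absolute values) of the whole weight vector. A minimal sketch under that assumption, using a hypothetical helper `l1_share` and a toy vector rather than the real weights:

```python
def l1_share(weights):
    """Return each weight's magnitude as a percentage of the vector's L1 norm.

    Assumption: this mirrors the "% of L1" column; the dashboard itself
    does not document the formula.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1_norm for w in weights]


# Toy vector (not taken from the dashboard): L1 norm is 1.0.
toy_weights = [0.5, -0.25, 0.25]
shares = l1_share(toy_weights)
```

Under this reading, a value of +0.17 at 0.5% would imply an L₁ norm of roughly 34 for the full vector, which is consistent with a weight vector spread thinly over many neurons.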
Correlated Neurons
Index    P. Corr.    Cos Sim.
1445     +0.17       0.06
1003     +0.14       0.03
16       +0.13       0.05
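The two columns above are abbreviated and not defined on the page. Assuming "P. Corr." means the Pearson correlation between two activation series and "Cos Sim." means the cosine similarity between two weight vectors, the quantities can be sketched as follows. The function names and the toy inputs are illustrative only and are not taken from the dashboard.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy inputs: perfectly correlated series and orthogonal vectors.
r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
c = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

The difference matters when reading the table: Pearson correlation subtracts the mean before comparing, while cosine similarity does not, so the two numbers for the same neuron need not agree.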
Negative Logits
essment     -0.61
putes       -0.59
måneder     -0.59
esserts     -0.59
]--;        -0.59
vocable     -0.59
larged      -0.59
purée       -0.58
quitous     -0.58
cipline     -0.57
Positive Logits
Khart       1.50
Keny        1.40
Juf         1.39
soulign     1.33
accla       1.32
simplif     1.31
volunte     1.28
confé       1.28
Kün         1.27
pleins      1.27
Activations Density 0.242%