INDEX
Explanations
the beginning of text or correspondence
Neuron Alignment
Index   Value   % of L₁
674     +0.29   6.2%
1741    +0.06   1.2%
50      +0.05   1.0%
Correlated Neurons
Index   P. Corr.   Cos Sim.
674     +0.29      0.12
1713    +0.06      0.74
108     +0.05      0.57
Negative Logits
massacres        -1.07
disinformation   -0.98
atrocities       -0.96
bombings         -0.93
fascism          -0.92
insurgency       -0.92
mismanagement    -0.91
traitors         -0.91
insurgents       -0.91
bloodshed        -0.90
Positive Logits
<bos>             16.99
expandindo        2.80
GEBURTSDATUM      2.78
betweenstory      2.67
Administrativna   2.58
Autoritní         2.57
تقاوى             2.54
dispen            2.49
Италијани         2.38
Italijani         2.36
Activations Density: 0.958%