INDEX
Explanations
the presence of the beginning of the text or the start of a new section
Neuron Alignment
Index    Value    % of L₁
674      +0.43    12.5%
1741     +0.05    1.5%
1870     +0.03    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
674      +0.43       0.21
108      +0.05       0.58
1523     +0.03       0.42
Negative Logits
massacres         -0.98
atrocities        -0.87
disinformation    -0.86
bombings          -0.86
corruption        -0.86
insurgents        -0.85
insurgency        -0.84
ABORT             -0.83
fascism           -0.82
traitors          -0.81
Positive Logits
<bos>              16.59
expandindo         2.55
GEBURTSDATUM       2.50
betweenstory       2.41
Autoritní          2.35
encomp             2.32
Administrativna    2.31
dispen             2.29
تقاوى              2.29
intersper          2.28
Activations Density: 0.956%