INDEX
Explanations
the beginning and end of text sections
Neuron Alignment
Index   Value   % of L₁
674     +0.44   5.6%
1741    +0.06   0.8%
1870    +0.04   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
800     +0.44      0.33
15      +0.06      0.32
354     +0.04      0.32
Negative Logits
unspeak       -2.69
reluct        -2.49
disgra        -2.46
unlaw         -2.46
shenan        -2.35
impractica    -2.34
ineffec       -2.31
horrend       -2.31
impra         -2.28
disagre       -2.25
Positive Logits
<bos>              14.44
GEBURTSDATUM       2.55
expandindo         2.53
betweenstory       2.49
Autoritní          2.46
تقاوى              2.20
Italijani          2.16
Administrativna    2.16
Paglinawan         2.12
kasarigan          2.11
Activations Density 0.089%