INDEX
Explanations
phrases related to political events and figures
Neuron Alignment
Index    Value    % of L₁
776      +0.16    0.5%
1905     +0.13    0.4%
1870     +0.11    0.3%
Correlated Neurons
Index    Pearson Corr.    Cosine Sim.
981      +0.16            0.04
818      +0.13            0.04
1654     +0.11            0.04
Negative Logits
Token              Logit
autorytatywna      -1.17
GeoNames           -1.06
Administrativna    -0.99
EconPapers         -0.96
новништво          -0.94
invokingState      -0.92
ORGANISM           -0.91
Winaray            -0.91
piram              -0.91
SharedDtor         -0.90
Positive Logits
Token         Logit
increa        2.12
intersper     2.09
encomp        2.09
depic         2.02
accla         2.02
inconce       2.01
maneu         1.97
impra         1.96
inev          1.96
affor         1.96
Activations Density: 0.207%