INDEX
Explanations
terms related to political history and actions
Neuron Alignment
Index   Value   % of L₁
1870    +0.08   0.2%
80      +0.08   0.2%
376     +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
376     +0.08      0.04
1856    +0.08      0.04
1793    +0.07      0.03
Negative Logits
utop          -0.79
logis         -0.75
zó            -0.68
Departement   -0.66
republi       -0.65
geolog        -0.63
gallina       -0.61
gita          -0.61
solidar       -0.60
stör          -0.60
Positive Logits
history       0.72
history       0.63
History       0.57
HISTORY       0.55
histories     0.55
voud          0.53
tradition     0.53
četně         0.52
record        0.52
textTheme     0.52
Activations Density 0.199%