INDEX
Explanations
phrases related to political and governmental actions, as well as specific individuals and their roles in society
Neuron Alignment
Index    Value    % of L₁
1052     +0.14    0.5%
752      +0.12    0.4%
897      +0.11    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1052     +0.14       0.06
1218     +0.12       0.04
897      +0.11       0.04
Negative Logits
ביוגרפיה    -0.51
Appellee    -0.50
Affirmed    -0.49
קישורים    -0.47
estaw       -0.43
worin       -0.42
鹸          -0.42
QtGui       -0.41
AFFIRMED    -0.41
ślę         -0.40
Positive Logits
lele       0.91
maksi      0.89
naer       0.84
territo    0.82
lü         0.81
akku       0.81
Whence     0.78
truk       0.77
kamb       0.77
Kere       0.76
Activations Density 0.167%