INDEX
Explanations
phrases related to various societal issues and concepts such as political strategy, governmental practices, and social norms
Neuron Alignment
Index   Value   % of L₁
1314    +0.13   0.4%
1438    +0.11   0.3%
1978    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1314    +0.13      0.08
284     +0.11      0.07
1438    +0.09      0.05
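The two columns above read as standard similarity measures. The following is a minimal sketch, assuming that "P. Corr." is the Pearson correlation and "Cos Sim." is the cosine similarity between two equal-length numeric vectors. The dashboard does not state which vectors it compares, so the function names and toy inputs here are hypothetical and are not taken from the tool that produced these numbers.

```python
import math

def pearson_corr(x, y):
    """Pearson correlation of two equal-length sequences (assumed meaning of 'P. Corr.')."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances are left unnormalized; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def cosine_sim(x, y):
    """Cosine similarity of two equal-length sequences (assumed meaning of 'Cos Sim.')."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Toy vectors for illustration only; they are not dashboard data.
feature_acts = [0.0, 0.5, 1.0, 0.2]
neuron_acts = [0.1, 0.4, 0.9, 0.3]
print(pearson_corr(feature_acts, neuron_acts))
print(cosine_sim(feature_acts, neuron_acts))
```

Pearson correlation subtracts each vector's mean before comparing, while cosine similarity does not, which is one reason the two columns can differ for the same index.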
Negative Logits
externi           -0.65
Datuak            -0.65
CiNii             -0.63
OGND              -0.63
persino           -0.60
perciò            -0.60
Билгалдахарш      -0.57
esattamente       -0.57
zarchiwizowane    -0.56
Cechy             -0.56
Positive Logits
dinas             0.48
alip              0.45
jawa              0.45
karna             0.44
saar              0.43
saad              0.43
wali              0.43
tuta              0.42
maske             0.42
xdrive            0.40
Activations Density 1.108%