INDEX
Explanations
mentions of international politics and agreements
Neuron Alignment
Index   Value   % of L₁
1145    +0.15   0.5%
131     +0.12   0.4%
1363    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1145    +0.15      0.07
1363    +0.12      0.06
981     +0.11      0.07
Negative Logits
Tikang         -0.55
towany         -0.51
bună           -0.49
BigNumberish   -0.48
asserole       -0.47
znego          -0.47
potrivit       -0.45
dziew          -0.44
sumowanie      -0.44
CFU            -0.44
Positive Logits
maksi     1.20
kac       1.16
alkoh     1.06
kram      1.06
silikon   1.04
panik     1.04
seksi     1.02
ekos      1.01
kasa      1.00
jaya      0.99
Activations Density: 0.370%