INDEX
Explanations
phrases related to political news and governmental actions, specifically referencing international relations and political tensions
Neuron Alignment
Index   Value   % of L₁
1741    +0.27   1.0%
1967    +0.18   0.7%
50      +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
16      +0.27      0.10
1967    +0.18      0.08
50      +0.13      0.07
Negative Logits
ypeł         -0.65
pensieri     -0.60
commenti     -0.60
@[+][        -0.57
idać         -0.57
messaggi     -0.56
))^{         -0.55
gardien      -0.54
sentimenti   -0.54
drage        -0.54
Positive Logits
Perci    0.74
Áng      0.73
Apare    0.71
stiamo   0.68
Darío    0.67
Diez     0.66
Egli     0.65
Haci     0.63
vuol     0.62
pié      0.59
Activations Density 0.380%