INDEX
Explanations
references to military actions and geopolitics
Negative Logits
tron      -0.16
erguson   -0.15
Ebola     -0.15
dumped    -0.15
antium    -0.14
iler      -0.14
wich      -0.14
Cair      -0.14
絡        -0.14
UpDown    -0.14
Positive Logits
Ukraine     0.35
Ukrain      0.35
Ukrainian   0.34
Ky          0.33
Ukr         0.29
ky          0.28
Biden       0.27
Uk          0.26
Putin       0.26
Ky          0.26
Activations Density: 0.043%