INDEX
Explanations
references to political or organizational actions and events
Negative Logits
меÑĩ  -0.16
ÏĢοÏħ  -0.15
ovi  -0.15
olang  -0.14
eut  -0.14
FIT  -0.14
ÐľÐŀ  -0.14
wig  -0.13
æ¡ij  -0.13
ìŀ¡  -0.13
Positive Logits
majority  0.16
useppe  0.15
Suff  0.15
LD  0.14
recent  0.14
ãĤ¤ãĤº  0.14
æĿ¡  0.14
Marina  0.13
res  0.13
current  0.13
Activations Density: 0.017%