Explanations
references to specific political or activist groups and their actions
Negative Logits
ı         -0.17
hardt     -0.15
orno      -0.14
ãĥIJãĤ¤    -0.14
ìķĻ       -0.14
arga      -0.14
ota       -0.14
ãĤĤãĤĬ    -0.14
aisal     -0.14
onta      -0.14
Positive Logits
Fal         0.19
regime      0.18
Bucc        0.16
ex          0.16
cult        0.16
Geld        0.16
democratic  0.16
Moj         0.15
iyon        0.15
ollar       0.15
Activations Density 0.005%
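
Several negative-logit tokens (for example `ãĥIJãĤ¤`, `ìķĻ`, `ãĤĤãĤĬ`) look like raw byte-level BPE vocabulary strings rather than readable text. The sketch below shows one way to map such strings back to UTF-8, assuming the model uses a GPT-2-style byte-to-character table; the page does not name the tokenizer, so that is an assumption. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the byte -> printable-character table of GPT-2-style byte-level BPE.

    Assumption: the tokenizer behind this dashboard follows that convention.
    """
    # Bytes that are already printable keep their own code point.
    byte_values = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    code_points = byte_values[:]
    offset = 0
    # Every remaining byte is shifted to an unused code point at 256 and above.
    for b in range(256):
        if b not in byte_values:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return dict(zip(byte_values, map(chr, code_points)))


# Inverse table: vocabulary character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a vocabulary string back to text; undecodable bytes become U+FFFD."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥIJãĤ¤", "ìķĻ", "ãĤĤãĤĬ", "regime"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, the three strings above decode to `バイ`, `앙`, and `もり`, while plain ASCII tokens such as `regime` pass through unchanged. A lone `ı` would correspond to a single UTF-8 continuation byte, so it has no readable form on its own.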