Explanations
mentions of the organization "Antifa"
references to specific political groups, particularly Antifa and related individuals
Negative Logits
mong        -0.84
matic       -0.84
borne       -0.80
race        -0.80
marked      -0.76
manship     -0.73
bone        -0.72
itialized   -0.70
link        -0.70
stall       -0.69
Positive Logits
ignt        1.10
IELD        1.01
ÄŁ          0.91
zzle        0.89
qs          0.80
udeb        0.79
ça          0.79
ieri        0.79
zza         0.77
unction     0.76
Activations Density 0.016%
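
As a rough illustration of what a density figure like this usually measures, the sketch below assumes it is the fraction of token positions at which the feature's activation is above zero. The source does not state the definition, so treat that as an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are chosen only so that the result lands on 0.016%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 2 of 12,500 token positions.
acts = [0.0] * 12_500
acts[10] = 3.2
acts[4_000] = 0.7

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.016%
```

A density this low would mean the feature is silent on almost all text, which fits a feature described as firing on mentions of one specific organization.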