INDEX
Explanations
negative and critical language directed towards individuals or groups, particularly in a political context
Negative Logits
spasm         -0.50
رع            -0.48
Plen          -0.47
idiv          -0.46
EMERGENCY     -0.46
orias         -0.45
verz          -0.45
gynhyrchwyd   -0.44
جذ            -0.44
pinta         -0.43
Positive Logits
insults       1.24
criticism     1.22
mocking       1.15
insult        1.15
dispar        1.11
attacks       1.10
ridicule      1.08
insulting     1.08
criticize     1.07
attack        1.06
Activations Density: 0.514%