Explanations
strong emotional language and criticism in political contexts
Negative Logits
Token      Logit
ocument    -0.76
scope      -0.74
ICLE       -0.70
aea        -0.70
rh         -0.69
ometime    -0.69
emetery    -0.68
OTE        -0.68
minster    -0.68
matic      -0.67
Positive Logits
Token        Logit
critics      1.03
commenters   0.94
critic       0.88
mockery      0.88
hypocrisy    0.87
bullies      0.84
detractors   0.84
merciless    0.84
accusing     0.83
harshly      0.83
Activations Density: 1.185%