INDEX
Explanations
language related to threats and intimidation
Negative Logits
Token      Logit
ittel      -0.16
ENDED      -0.15
esel       -0.15
yme        -0.14
aná        -0.14
iban       -0.14
obe        -0.14
otten      -0.14
.oracle    -0.14
-spe       -0.14
Positive Logits
Token        Logit
-threat      0.16
threatens    0.15
threatened   0.15
-toggler     0.15
threats      0.15
addock       0.15
gas          0.14
intimid      0.14
ingly        0.14
threatening  0.14
Activations Density 0.064%