Explanations
words related to physical or verbal threats or coercion
instances of intimidation and related actions
Negative Logits
Token     Logit
ports     -0.80
Foot      -0.79
izen      -0.75
noon      -0.74
launch    -0.73
empl      -0.72
argo      -0.71
odes      -0.71
night     -0.70
Fault     -0.69
Positive Logits
Token           Logit
intimidation    1.05
tactics         0.97
intimidated     0.96
intimid         0.87
intimidate      0.87
coer            0.86
stalking        0.82
retaliation     0.78
tactic          0.78
blackmail       0.75
Activations Density 0.028%
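
The density figure can be read as the share of token positions on which the feature fires. The sketch below assumes that "density" means the fraction of positions with an activation above zero; the dashboard does not state its exact definition, and the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a feature "fires" on a token when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data (not from the dashboard): 28 firing positions out of 100,000.
toy = [1.0] * 28 + [0.0] * (100_000 - 28)
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.028% corresponds to the feature firing on roughly 28 of every 100,000 tokens, which fits a feature tied to a narrow vocabulary such as the intimidation-related tokens listed above.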