Explanations
verb forms related to making threats
instances of threats or threatening behavior
Negative Logits
gart      -1.01
angles    -0.81
mys       -0.80
gae       -0.78
æ©Ł       -0.76
coat      -0.76
olds      -0.75
rite      -0.74
tein      -0.73
cise      -0.71
Positive Logits
retaliation     0.99
retribution     0.89
repr            0.87
annihilation    0.86
lessly          0.80
eviction        0.79
violence        0.75
expulsion       0.74
termination     0.74
suicide         0.73
Activations Density 0.045%