Explanations
instances of physical confrontations or altercations between people
pronouns referring to individuals involved in various actions or situations
Negative Logits
Token      Logit
iencies    -0.87
ãĥ£        -0.66
orie       -0.65
ILCS       -0.65
BN         -0.64
apter      -0.62
lance      -0.62
votes      -0.61
ilib       -0.60
np         -0.59
Positive Logits
Token            Logit
senseless        1.01
harshly          0.87
merciless        0.86
orally           0.82
violently        0.81
verbally         0.81
unconscious      0.80
relentlessly     0.78
inappropriately  0.78
undermin         0.78
Activations Density 0.164%