Explanations
references to physical altercations or fights
instances of conflict or physical confrontations
Negative Logits
Token        Logit
akeru        -0.72
Label        -0.71
eele         -0.70
stocking     -0.67
engineering  -0.63
ãģĹ          -0.63
certific     -0.62
extra        -0.62
label        -0.62
SOURCE       -0.62
Positive Logits
Token        Logit
ensued       1.26
between      1.18
involving    0.91
halla        0.87
brawl        0.85
manship      0.85
raged        0.84
amongst      0.84
erupted      0.83
between      0.82
Activations Density 0.195%