Explanations
words related to violence involving relationships and conflict
Negative Logits
eret       -0.80
Wide       -0.73
taboola    -0.70
ometer     -0.70
oute       -0.70
akeru      -0.68
antine     -0.67
ĸļ         -0.67
eele       -0.67
soType     -0.65
Positive Logits
unborn      0.98
innocent    0.97
messenger   0.93
entire      0.92
whistle     0.87
intruder    0.86
unarmed     0.86
innoc       0.85
offending   0.81
hostages    0.80
Activations Density: 0.178%