Explanations
references to prisoners or captivity
terms related to prisoners of war
Negative Logits
orp        -0.79
Boll       -0.76
orie       -0.71
laus       -0.65
wig        -0.65
ulously    -0.64
OPA        -0.63
alore      -0.62
ripp       -0.62
amera      -0.62
Positive Logits
prisoners       1.05
captives        0.91
inmates         0.88
prisoner        0.85
sentenced       0.81
detainees       0.80
incarcerated    0.75
icts            0.74
jailed          0.73
hostages        0.73
Activations Density: 0.034%