INDEX
Explanations
phrases related to accountability and holding individuals or entities responsible
Negative Logits
osit          -0.16
lex           -0.15
HING          -0.15
RuleContext   -0.15
thuis         -0.14
sbin          -0.14
adb           -0.14
ounder        -0.14
992           -0.14
atable        -0.14
Positive Logits
hostage       0.34
accountable   0.34
ransom        0.30
sway          0.27
prisoner      0.25
alo           0.25
captive       0.25
steady        0.21
held          0.20
held          0.19
Activations Density: 0.035%