Explanations
mentions of escaping or escape-related actions
instances of the word "escape" and its variations, particularly in the context of fleeing or being freed from a situation
Negative Logits
Token      Logit
stead      -0.63
ificent    -0.62
inki       -0.61
spir       -0.61
ftime      -0.60
papers     -0.60
¾          -0.59
ledger     -0.59
abouts     -0.58
iann       -0.58
Positive Logits
Token          Logit
detection      0.98
unsc           0.93
captivity      0.90
confinement    0.86
from           0.85
capture        0.84
punishment     0.84
into           0.81
prosecution    0.78
justice        0.74
Activations Density: 0.054%
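The page does not define how the activation density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, with the feature firing on 5 of them.
acts = [0.0] * 10_000
for position in (12, 42, 345, 6789, 9999):
    acts[position] = 1.5

density = activation_density(acts)
print(f"Activations Density: {density:.3%}")
```

Under this reading, a density of 0.054% would mean the feature fires on roughly 5 to 6 tokens out of every 10,000 sampled.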