INDEX
Explanations
phrases related to actions of taking someone somewhere or taking a photo
references to characters and relationships in narratives
Negative Logits
etheless    -0.79
icion       -0.70
cess        -0.65
ept         -0.64
itated      -0.61
riot        -0.61
ificent     -0.60
raid        -0.60
regate      -0.60
erie        -0.59
Positive Logits
hostage     1.29
aback       1.24
seriously   1.11
away        1.02
apart       0.98
Seriously   0.98
captive     0.94
prisoner    0.93
offline     0.92
into        0.88
Activations Density: 0.119%
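
The page does not define activation density. A common reading is the share of sampled tokens on which the feature fires at all. The sketch below illustrates that reading only; the helper name `activation_density`, the zero threshold, and the sample values are assumptions for illustration, not data or code from this feature.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of tokens with a nonzero
    (here: above-threshold) activation. The source does not confirm this.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Made-up activations for ten tokens; two of them are active.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0]
print(f"{activation_density(sample):.1f}%")
```

Under that reading, a density of 0.119% would correspond to roughly one active token in every 840.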