Explanations
phrases related to coercion or being compelled to do something against one's will
instances of the word "forced" in various contexts
Negative Logits
vironment     -0.68
ership        -0.65
rou           -0.64
iverpool      -0.64
Enh           -0.63
orkshire      -0.63
Offense       -0.62
itect         -0.62
iosyncr       -0.60
amina         -0.59
Positive Logits
into          0.94
otom          0.90
aback         0.84
untarily      0.80
overtime      0.75
onto          0.73
shut          0.73
offline       0.69
into          0.68
INTO          0.68
Activations Density 0.049%