INDEX
Explanations
phrases related to being restricted or confined
occurrences of the word "locked"
Negative Logits
lus          -0.87
resso        -0.83
annis        -0.71
lication     -0.70
arenthood    -0.67
umbn         -0.67
esan         -0.66
article      -0.65
mony         -0.65
nor          -0.64
Positive Logits
picking      0.93
locked       0.85
locked       0.83
recess       0.77
heed         0.76
lock         0.72
lock         0.71
locking      0.71
horns        0.71
fist         0.69
Activations Density: 0.011%