INDEX
Explanations
phrases related to being confined or restricted in some way
references to being confined or restricted
Negative Logits
Token        Logit
issance      -1.00
resso        -0.79
lus          -0.75
ciation      -0.74
annis        -0.74
lication     -0.70
lene         -0.70
arenthood    -0.69
enegger      -0.69
Lynd         -0.67
Positive Logits
Token        Logit
door         0.82
doors        0.80
locked       0.79
picking      0.76
onto         0.74
locked       0.74
shut         0.73
doors        0.72
chests       0.71
door         0.69
Activations Density: 0.031%