INDEX
Explanations
references to locations or structures being inside of something
references to various locations
Negative Logits
QL        -0.77
emale     -0.73
arcity    -0.73
folk      -0.73
eday      -0.71
wives     -0.71
ingle     -0.70
cific     -0.70
uten      -0.70
ocious    -0.69
Positive Logits
walls      1.09
confines   1.08
bubble     1.00
doors      0.98
box        0.94
cage       0.93
envelope   0.91
gates      0.91
boxes      0.89
mouths     0.86
Activations Density: 0.122%