INDEX
Explanations
phrases related to being outside or external to something
references to the concept of being "outside"
Negative Logits
Token        Logit
enegger      -0.87
iji          -0.83
ruary        -0.73
lda          -0.69
irez         -0.69
ered         -0.69
inately      -0.68
lication     -0.67
amaz         -0.67
fect         -0.67
Positive Logits
Token        Logit
observer     0.82
linebacker   0.74
Outs         0.72
world        0.72
most         0.72
workings     0.71
diameter     0.71
mole         0.70
world        0.69
linebackers  0.69
Activations Density: 0.040%