Explanations
phrases related to doors and closed spaces
Negative Logits
Token      Logit
ting       -0.67
orp        -0.63
cies       -0.63
udeb       -0.62
TING       -0.62
uala       -0.61
ter        -0.61
TY         -0.60
Err        -0.59
icators    -0.59
Positive Logits
Token      Logit
neighbor   0.83
steps      0.82
neighbour  0.79
door       0.76
ħĭ         0.75
ways       0.69
wife       0.67
pupil      0.64
neighbors  0.64
pathy      0.62
Activations Density: 7.491%