INDEX
Explanations
instances of the word "in" used in various contexts
Negative Logits
aurant      -0.17
nutshell    -0.16
cui         -0.16
Sense       -0.15
sofar       -0.15
onu         -0.15
ninger      -0.14
stood       -0.14
ISMATCH     -0.14
vented      -0.13
Positive Logits
concert     0.24
lock        0.22
dro         0.21
-step       0.18
step        0.18
stages      0.18
ern         0.17
fits        0.17
turn        0.17
sync        0.17
Activations Density: 0.155%