INDEX
Explanations
phrases indicating variability or options available within context
Negative Logits
stuff       -0.16
pper        -0.14
ste         -0.14
pe          -0.14
Hud         -0.14
akeup       -0.14
ocker       -0.14
isp         -0.14
Contents    -0.14
querque     -0.13
Positive Logits
/all        0.22
place       0.20
/e          0.20
kind        0.19
combination 0.18
where       0.18
combination 0.17
THING       0.16
hoo         0.16
uzzer       0.16
Activations Density: 0.052%