Explanations
phrases that involve sending or directing someone or something to a specific location or outcome
Negative Logits
okrat              -0.16
wards              -0.15
ewed               -0.14
oga                -0.14
eg                 -0.14
ibi                -0.14
ews                -0.14
uing               -0.14
————————————————   -0.14
_sidebar           -0.13
Positive Logits
packing            0.35
Packing            0.28
packing            0.22
home               0.20
forth              0.20
sent               0.19
tum                0.19
PACK               0.18
into               0.18
Spir               0.17
Activations Density: 0.031%