Explanations
phrases involving the act of throwing or discarding, often metaphorically
Negative Logits
cke       -0.18
endon     -0.16
ummings   -0.16
SSION     -0.15
ienda     -0.15
abis      -0.15
cts       -0.14
ilon      -0.14
ckt       -0.14
owa       -0.14
Positive Logits
caution   0.28
shade     0.26
away      0.26
tantr     0.25
back      0.23
-away     0.23
away      0.23
aways     0.22
aside     0.22
down      0.22
Activations Density: 0.034%