INDEX
Explanations
prepositions and verbs related to movement or action
prepositions and phrases indicating direction or purpose
Negative Logits
Token       Logit
cember      -0.68
Released    -0.65
teasp       -0.58
uber        -0.58
pection     -0.57
aird        -0.57
jo          -0.57
arnaev      -0.56
iam         -0.56
ird         -0.56
Positive Logits
Token       Logit
oneself     0.94
whatever    0.81
anything    0.80
addons      0.77
whichever   0.75
uate        0.70
WARD        0.70
any         0.67
something   0.67
Yourself    0.65
Activations Density: 0.534%