Explanations
phrases related to social actions or directions
prepositions and directional terms
Negative Logits
hler          -0.53
iera          -0.53
counting      -0.52
pection       -0.51
xtap          -0.50
tallied       -0.50
quad          -0.50
imov          -0.48
compounded    -0.48
ּ             -0.48
Positive Logits
oneself       0.75
Yourself      0.59
rue           0.55
someday       0.55
anytime       0.53
opic          0.53
yourself      0.52
certain       0.52
your          0.51
anything      0.51
Activations Density 0.942%