INDEX
Explanations
phrases related to personal actions and interpersonal interactions
Negative Logits
Token          Logit
rather         -0.74
ortment        -0.69
itton          -0.67
yrinth         -0.63
ciplinary      -0.61
unknown        -0.61
okingly        -0.61
not            -0.61
accompanied    -0.60
Discussion     -0.60
Positive Logits
Token          Logit
anymore        2.03
nor            1.57
anything       1.39
any            1.32
anybody        1.26
whatsoever     1.26
slightest      1.23
anywhere       1.17
ANY            1.16
anyone         1.12
Activations Density: 1.974%