INDEX
Explanations
phrases related to physical actions or events, particularly those involving mistreatment or shocking behaviors
Negative Logits
orbit        -0.90
Strange      -0.89
é¾           -0.89
mare         -0.88
Introduced   -0.88
operated     -0.88
fleet        -0.88
illin        -0.87
heart        -0.86
glass        -0.86
Positive Logits
example      1.58
instance     1.53
geries       1.49
bidden       1.36
purposes     1.35
gery         1.29
starters     1.28
reasons      1.26
awhile       1.26
gotten       1.25
Activations Density 2.404%