Explanations
words and phrases indicating intention and actions related to giving or taking advantage
Negative Logits
Token    Logit
ymb      -0.16
zero     -0.15
hton     -0.15
ismet    -0.14
-zero    -0.14
zero     -0.14
986      -0.14
VL       -0.14
apore    -0.14
elite    -0.14
Positive Logits
Token           Logit
especial        0.20
due             0.18
mue             0.17
satisfaction    0.17
within          0.17
scope           0.17
facilities      0.16
leave           0.16
toler           0.16
fitting         0.16
Activations Density: 0.349%