Explanations
phrases related to decision-making and consequences
Negative Logits
quite
-0.21
probably
-0.20
both
-0.19
almost
-0.19
if
-0.18
nearly
-0.17
Quite
-0.17
Freeman
-0.17
when
-0.16
darn
-0.16
Positive Logits
，则
0.28
_______,
0.25
yoksa
0.24
的话
0.21
라도
0.21
某
0.20
nào
0.19
ใด
0.19
(any
0.18
plx
0.18
Activations Density 0.394%