INDEX
Explanations
statements indicating action or decision-making
words and phrases indicating direct statements or actions
Negative Logits
Token        Logit
tis          -0.77
currently    -0.76
enei         -0.74
alg          -0.70
wayne        -0.67
arel         -0.66
ordinarily   -0.65
linked       -0.62
gae          -0.61
typically    -0.61
Positive Logits
Token            Logit
wrong            0.77
inappropriately  0.77
Doct             0.71
LAST             0.70
mistakes         0.69
beforehand       0.69
last             0.69
yesterday        0.67
ocument          0.65
イト             0.62
Activations Density 0.510%