INDEX
Explanations
words and phrases associated with agreement and commitment
Negative Logits
prob        -0.15
dney        -0.12
ensex       -0.12
อบ          -0.12
YPRE        -0.11
泰          -0.11
afone       -0.11
apol        -0.10
CSI         -0.10
coration    -0.10
Positive Logits
the         0.47
the         0.32
the         0.32
,the        0.29
The         0.27
The         0.26
_the        0.26
…the        0.24
.the        0.24
The         0.22
Activations Density: 7.637%