Explanations
phrases indicating honesty or sincerity
Negative Logits
aver        -0.16
âĢı         -0.15
ç¯Ģ         -0.15
na          -0.15
arin        -0.14
gesch       -0.14
Ñģ          -0.14
gas         -0.14
ultimate    -0.14
rous        -0.14
Positive Logits
ewis        0.19
iyel        0.18
upp         0.17
-Sah        0.16
æ£ļ         0.15
enance      0.15
heits       0.15
ifax        0.15
LTR         0.14
anchors     0.14
Activations Density: 0.024%