INDEX
Explanations
phrases indicating negation or denial
affirmative statements or declarations
Negative Logits
iership       -0.91
ãĤ¼ãĤ¦ãĤ¹     -0.71
ript          -0.68
rn            -0.67
RAFT          -0.63
rex           -0.63
tro           -0.63
arthy         -0.62
assies        -0.61
tnc           -0.61
Positive Logits
kidding       1.09
etheless      1.06
matter        0.94
xious         0.93
wonder        0.92
zzle          0.92
terday        0.91
doubt         0.89
longer        0.89
Matter        0.82
Activations Density: 0.052%