INDEX
Explanations
negative statements where the subject doesn't understand, recommend, like, know, have, or feel comfortable
negative or dismissive phrases indicating disbelief or lack of understanding
Negative Logits
ãĤ¶          -0.94
NetMessage   -0.80
once         -0.76
Reader       -0.73
unless       -0.72
æ©Ł          -0.71
emonium      -0.71
[]           -0.70
ãĤ¤ãĥĪ       -0.68
ĸļ           -0.68
Positive Logits
already        1.14
comply         0.86
careful        0.85
resolve        0.83
agree          0.81
satisf         0.81
sufficiently   0.80
succeed        0.80
cooperate      0.76
suffice        0.75
Activations Density 0.097%
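
Several entries in the negative-logit list (ãĤ¶, æ©Ł, ãĤ¤ãĥĪ, ĸļ) look like vocabulary strings from a byte-level BPE tokenizer, where each raw byte is displayed as a stand-in Unicode character. The sketch below shows one way to map such strings back to text. It assumes the model behind this dashboard uses the GPT-2 style byte-to-unicode table, which the page does not state; the helper names `bytes_to_unicode`, `token_bytes`, and `decode_token` are illustrative and not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> display-character table (assumed tokenizer)."""
    # Printable bytes are displayed as themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at or above 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: display character -> raw byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token: str) -> bytes:
    """Return the raw bytes that a displayed token string stands for."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


def decode_token(token: str) -> str:
    """Decode a displayed token to text; invalid UTF-8 becomes U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ¶", "æ©Ł", "ãĤ¤ãĥĪ", "ĸļ"]:
        print(repr(tok), token_bytes(tok).hex(), repr(decode_token(tok)))
```

Under that assumption, the first three tokens decode to Japanese or CJK characters, while ĸļ corresponds to two UTF-8 continuation bytes with no lead byte, so it is a fragment of a longer character rather than readable text on its own.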