Explanations
the negation of statements
phrases expressing negation or denial
Negative Logits
ounters     -0.73
Parenthood  -0.69
ò           -0.67
pione       -0.66
ñ           -0.66
Compass     -0.61
footed      -0.61
Gap         -0.61
Passage     -0.61
emetery     -0.60
Positive Logits
kidding      1.08
personally   1.01
myself       0.95
believe      0.88
ashamed      0.87
sure         0.86
hesitate     0.83
joking       0.82
doubt        0.82
necessarily  0.81
Activations Density 0.124%