INDEX
Explanations
statements negating or denying some kind of association or influence
phrases that emphasize lack or negation
Negative Logits
Eid           -0.66
Foley         -0.65
Metatron      -0.65
Polk          -0.63
Falk          -0.63
Citizens      -0.63
McCabe        -0.61
Greenberg     -0.61
Extensions    -0.61
Flo           -0.59
Positive Logits
uncertain     1.04
way           0.99
hurry         0.96
doubt         0.89
WAY           0.87
particular    0.84
osaurs        0.81
xus           0.81
cific         0.81
danger        0.80
Activations Density 0.036%