Explanations
contractions where the words "not" and/or "do not" follow a pronoun or auxiliary verb
negations or expressions of denial
Negative Logits
Token         Logit
amer          -0.72
Site          -0.70
ancer         -0.68
éĥ            -0.67
Intern        -0.65
bard          -0.63
itiz          -0.63
Reviewer      -0.63
inav          -0.63
interstitial  -0.63
Positive Logits
Token         Logit
necessarily   1.26
bother        1.11
icably        1.06
exactly       1.00
icable        0.97
epad          0.91
bothering     0.90
gonna         0.89
quite         0.88
urtles        0.88
Activations Density 0.088%