INDEX
Explanations
non-committal phrases following an affirmation
negative prefixes indicating unfairness or unacceptability
Negative Logits
Madden      -0.74
halves      -0.70
Duo         -0.70
briefs      -0.67
slate       -0.65
tee         -0.64
initials    -0.64
mos         -0.64
raids       -0.63
sheet       -0.63
Positive Logits
fortunately     1.43
iversity        1.39
usual           1.38
ivers           1.30
stable          1.30
iform           1.29
necessary       1.29
animous         1.27
surprisingly    1.20
employment      1.20
Activations Density: 0.021%