INDEX
Explanations
adjectives beginning with 'un'
negations or words beginning with "un-"
Negative Logits
Peb          -0.81
Tackle       -0.77
Pound        -0.76
briefs       -0.75
hetti        -0.73
Ples         -0.73
OPLE         -0.72
Ages         -0.72
Madden       -0.71
Sut          -0.70
Positive Logits
fortunately   1.19
usual         1.17
iversity      1.14
likely        1.14
iform         1.08
cles          1.07
iverse        1.06
employment    1.06
expected      1.05
stable        1.05
Activations Density: 0.018%