INDEX
Explanations
comparisons indicating that something is not as bad as perceived
comparative phrases indicating relative conditions or qualities
Negative Logits
rev       -0.69
uld       -0.67
ainer     -0.67
antage    -0.65
acent     -0.64
iverse    -0.64
itory     -0.63
REE       -0.62
heast     -0.62
reated    -0.62
Positive Logits
lihood     0.85
©¶æ        0.83
pects      0.79
opposed    0.77
evidenced  0.76
bestos     0.74
possible   0.73
ij士       0.71
usual      0.68
criptions  0.68
Activations Density: 0.088%