INDEX
Explanations
negative contractions and phrases expressing doubt or uncertainty
Negative Logits
<bos>         -1.56
leſs          -0.98
againſt       -0.98
ſelf          -0.98
ſelves        -0.93
iſt           -0.92
doubtnut      -0.92
Anſ           -0.91
rungsseite    -0.91
leſs          -0.91
Positive Logits
didn          0.85
it            0.78
doesn         0.73
wasn          0.73
It            0.71
shouldn       0.69
I             0.69
don           0.69
a             0.67
wouldn        0.66
Activations Density 0.063%
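
As a rough illustration of how a density figure like the one above can be read, the sketch below assumes that "activation density" means the fraction of sampled token positions on which the feature fires with a non-zero activation. That definition, the function name `activation_density`, and the sample data are all hypothetical and are not taken from the page itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (# positions with activation > threshold) / (# positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 63 of which activate the feature.
sample = [0.0] * 99_937 + [1.0] * 63
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage with three decimals
```

Under this reading, a density of 0.063% would correspond to the feature firing on roughly 63 out of every 100,000 tokens, which is consistent with a sparse feature tied to a narrow pattern such as negative contractions.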