INDEX
Explanations
negative constructions and expressions of disappointment or inadequacy
Negative Logits
Token           Logit
nl              -0.15
enheim          -0.15
possibile       -0.15
kenin           -0.15
daq             -0.15
GI              -0.15
meer            -0.15
okers           -0.15
atar            -0.14
issement        -0.14
Positive Logits
Token           Logit
bad             0.25
outright        0.24
Bad             0.23
BAD             0.21
_bad            0.20
necessarily     0.20
Bad             0.19
icí             0.17
.scalablytyped  0.17
bad             0.17
Activations Density 0.185%