INDEX
Explanations
intensifiers and degree adverbs that amplify descriptions
Negative Logits
Token         Logit
ones          -0.18
.appspot      -0.16
ombs          -0.15
aż            -0.14
reme          -0.14
adas          -0.14
abad          -0.14
htt           -0.14
Ones          -0.14
subroutine    -0.13
Positive Logits
Token         Logit
easy          0.17
edly          0.17
important     0.15
Easy          0.15
easy          0.15
likely        0.15
_easy         0.14
rage          0.14
-important    0.14
/stdc         0.14
Activations Density 0.121%