INDEX
Explanations
numerical values and references to quantities or counts
Negative Logits
Already     -0.18
just        -0.18
就在        -0.17
already     -0.17
especially  -0.17
Already     -0.16
alespoň     -0.16
orsch       -0.16
iddi        -0.15
atleast     -0.15
POSITIVE LOGITS
handful     0.29
(<          0.26
TOTAL       0.25
thôi        0.24
tiny        0.24
total       0.23
tiny        0.23
TOTAL       0.21
barely      0.21
duy         0.21
Activations Density 0.166%