Explanations
phrases indicating quantity or amount
Negative Logits
ちょ          -0.14
owers         -0.14
/Dk           -0.14
clair         -0.14
ernen         -0.13
*.            -0.13
@student      -0.13
gratuites     -0.13
upro          -0.13
handful       -0.13
Positive Logits
similar       0.23
similar       0.21
enough        0.20
same          0.18
same          0.17
보다          0.16
sufficiently  0.16
benzer        0.15
suf           0.15
quite         0.15
Activations Density 0.070%
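
The density figure can be read as a simple ratio. The sketch below assumes (this is not stated on the page) that density means the fraction of sampled tokens on which the feature's activation exceeds zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of 0.070%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all; the dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 7 of 10,000 tokens.
sample = [1.5] * 7 + [0.0] * 9993
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.070% corresponds to roughly 7 active tokens per 10,000 sampled.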