INDEX
Explanations
phrases indicating a small quantity or limited number of items
Negative Logits
amen     -0.17
uet      -0.14
rej      -0.14
uto      -0.14
ORIES    -0.14
ent      -0.14
only     -0.14
gang     -0.13
ned      -0.13
739      -0.13
Positive Logits
dozen     0.24
/all      0.17
ibrator   0.16
málo      0.16
kiye      0.16
десят     0.16
人的      0.16
ynom      0.15
-times    0.15
dalších   0.14
Activations Density 0.041%