Explanations
comparative phrases indicating proportions or quantities
Negative Logits
ught     -0.16
orate    -0.16
228      -0.15
nues     -0.15
onas     -0.14
luet     -0.14
ICLE     -0.14
cazzo    -0.14
iman     -0.14
shint    -0.14
Positive Logits
portion      0.16
portions     0.15
abet         0.15
majority     0.15
thousands    0.15
Majority     0.15
のは         0.14
ubat         0.14
anio         0.14
ucer         0.14
Activations Density 0.065%