INDEX
Explanations
phrases emphasizing beginnings or introductory elements in various contexts
Negative Logits
Token      Logit
only       -0.20
only       -0.18
ancak      -0.17
ONLY       -0.16
neither    -0.15
právě      -0.15
atleast    -0.15
åĦ         -0.15
hanya      -0.15
least      -0.15
Positive Logits
Token         Logit
plain         0.26
ifying        0.21
ifi           0.21
ifiable       0.20
Plain         0.20
plain         0.18
ifies         0.18
IFI           0.17
Plain         0.17
ifications    0.17
Activations Density: 0.199%