Explanations
Punctuation marks and designations in a document, indicating a focus on formatting or structural elements.
Negative Logits
Token       Logit
Feinstein   -0.17
uga         -0.16
emes        -0.15
egas        -0.15
oding       -0.14
uggle       -0.14
456         -0.14
çĬ          -0.14
oral        -0.14
Amb         -0.14
Positive Logits
Token       Logit
fraction    0.20
Fraction    0.17
licative    0.16
fractions   0.15
ãĤ¤ãĥ¤      0.15
Fraction    0.15
ç§          0.15
анÑģ        0.15
ivÄĽ        0.14
enville     0.14
Activations Density: 0.027%