INDEX
Explanations
document formatting indicators, specifically the beginning of sections and hierarchical organization
preceding various hyphenated words
phrases starting with common adjectives or possessives
Negative Logits
-0.90
↵
-0.88
↵↵
-0.83
[…]
-0.78
...
-0.72
-0.69
…
-0.69
...
-0.68
[...]
-0.63
-
-0.62
Positive Logits
rungsseite    0.95
تانيه    0.91
Datuak    0.90
ModelExpression    0.88
незавершена    0.87
Савезне    0.84
الحره    0.81
arşivlendi    0.80
HostException    0.80
تضيفلها    0.80
Activations Density 0.000%