Explanations
characters in different languages
symbols and non-standard characters
Negative Logits
awaru        -1.02
Flavoring    -0.90
merce        -0.88
nings        -0.85
contrace     -0.83
mathemat     -0.76
ciating      -0.75
kef          -0.73
simultane    -0.73
richness     -0.73
Positive Logits
ãĤ¡          1.09
ر            0.89
ople         0.83
ι            0.82
inx          0.79
¹            0.77
ħ            0.77
а            0.76
era          0.75
´            0.74
Activations Density: 0.004%