Explanations
emoticons and symbols
special characters or symbols, particularly a specific character that appears multiple times
Negative Logits
Token        Logit
dorm         -0.70
phrine       -0.70
Franch       -0.68
enium        -0.67
uded         -0.65
distilled    -0.65
coales       -0.64
contracted   -0.64
ynthesis     -0.63
guided       -0.62
Positive Logits
Token        Logit
Ģ            1.35
ðŁĺ          1.23
Ħ            1.21
³            1.18
¼            1.14
Ĥ            1.13
ī            1.13
¢            1.13
¹            1.10
ķ            1.09
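
The positive-logit tokens look like the printable stand-ins that GPT-2-style byte-level BPE tokenizers use for raw bytes. The page does not name the tokenizer, so this is only a minimal sketch under that assumption: it rebuilds the byte-to-character table in the way the GPT-2 encoder does and maps each token back to its bytes. The function and variable names are local to this sketch.

```python
def bytes_to_unicode():
    """Rebuild a GPT-2-style table mapping each byte (0-255) to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n, in byte order.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable stand-in character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Map a byte-level BPE token string back to the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


# Tokens copied from the positive-logit list (assumed to be byte-level BPE tokens).
positive_tokens = ["Ģ", "ðŁĺ", "Ħ", "³", "¼", "Ĥ", "ī", "¢", "¹", "ķ"]

for tok in positive_tokens:
    print(repr(tok), token_to_bytes(tok).hex())

# Joining the three-byte prefix with one trailing byte gives a complete UTF-8 character.
emoji = (token_to_bytes("ðŁĺ") + token_to_bytes("Ģ")).decode("utf-8")
print(emoji)
```

If the assumption holds, `ðŁĺ` stands for the bytes F0 9F 98, the UTF-8 prefix shared by the emoticons U+1F600 to U+1F63F, and every other token in the list is a single continuation byte that could complete such a sequence. That would be consistent with the "emoticons and symbols" explanation.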
Activations Density: 0.007%