Explanations
Russian words with specific characters
specific Cyrillic characters and letters, particularly those used in certain Slavic languages
Negative Logits
gdala       -0.74
iqueness    -0.74
ndra        -0.72
arton       -0.70
ayer        -0.69
reconc      -0.68
cause       -0.68
reads       -0.67
¿½          -0.67
guyen       -0.66
Positive Logits
оÐ          1.29
и           1.18
о           1.15
а           1.13
ÑĢ          1.09
л           1.09
ÑĤ          1.08
Ñĥ          1.05
Ñģ          0.97
е           0.93
Activations Density 0.010%