Explanations
Cyrillic characters
Specific Cyrillic characters, particularly the letter "н" and its variants
Negative Logits
Token       Logit
merce       -0.86
undai       -0.84
ichita      -0.82
perature    -0.81
kins        -0.81
atche       -0.78
yip         -0.77
eanor       -0.77
HCR         -0.74
pload       -0.74
Positive Logits
Token       Logit
и           1.15
оÐ          1.07
а           1.06
н           1.02
ÑĤ          1.02
о           1.00
е           0.97
Ñĭ          0.96
к           0.93
Ñ           0.93
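Some of the positive-logit tokens (ÑĤ, Ñĭ, the lone Ñ, and the trailing Ð in оÐ) look like byte-level BPE token strings rather than readable Cyrillic. The sketch below shows one way to read them, assuming the tokenizer uses the GPT-2 style byte-to-printable-character mapping; the source does not name the tokenizer, so that is an assumption, and the helper names `bytes_to_unicode_map` and `decode_token` are hypothetical. Characters that are not part of that mapping (such as an already readable "и") are passed through as their own UTF-8 bytes.

```python
def bytes_to_unicode_map():
    """Build the GPT-2 style table that maps each byte (0-255) to a printable character."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode_map().items()}


def decode_token(token):
    """Turn a displayed token string back into text (assumes the GPT-2 byte mapping)."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Character outside the mapping: assume it is shown already decoded.
            raw.extend(ch.encode("utf-8"))
    # Incomplete UTF-8 sequences become U+FFFD instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["и", "оÐ", "ÑĤ", "Ñĭ", "Ñ"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, ÑĤ decodes to "т" and Ñĭ to "ы", while Ñ and the trailing Ð are lone lead bytes of two-byte Cyrillic sequences and therefore decode to the replacement character. That reading is consistent with the "Cyrillic characters" explanation, but it holds only if the tokenizer assumption is right.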
Activations Density 0.008%