INDEX
Explanations
Russian Cyrillic characters, possibly related to text encoding
occurrences of the character "о" and similar Cyrillic characters
Negative Logits
Clover        -0.92
esville       -0.85
hyde          -0.79
ologically    -0.79
secut         -0.73
ometimes      -0.72
raints        -0.71
eele          -0.69
irable        -0.69
Ada           -0.69
Positive Logits
Ñ             1.21
ÑĢ            1.18
е             1.18
н             1.14
ÑĤ            1.13
Ñģ            1.11
о             1.09
и             1.04
Ð             1.01
Ñı            1.00
Activations Density 0.003%
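
The explanations tie this feature to Cyrillic text and "possibly" to text encoding. Several positive-logit tokens (Ñ, ÑĢ, ÑĤ, Ñģ, Ð, Ñı) look like the printable stand-ins that GPT-2-style byte-level BPE tokenizers use for raw bytes. The sketch below assumes that convention, which the page itself does not state, and maps such token strings back to bytes and then to UTF-8 text. The helper names `bytes_to_unicode`, `token_bytes` and `decode_token` are illustrative and do not come from the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the byte -> printable-character table used by
    GPT-2-style byte-level BPE (assumed convention for this page)."""
    # Bytes that are already printable map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token: str) -> bytes:
    """Return the raw bytes behind a byte-level token string."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def decode_token(token: str) -> str:
    """Decode a byte-level token to text; tokens that are already
    plain text (e.g. a literal Cyrillic letter) are returned unchanged."""
    if any(ch not in CHAR_TO_BYTE for ch in token):
        return token
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ÑĢ", "ÑĤ", "Ñģ", "Ñı"]:
        print(tok, token_bytes(tok).hex(), decode_token(tok))
    for tok in ["Ñ", "Ð"]:
        # Single lead bytes are not valid UTF-8 on their own.
        print(tok, token_bytes(tok).hex())
```

Under this assumption, ÑĢ, ÑĤ, Ñģ and Ñı decode to the Cyrillic letters р, т, с and я, while Ñ and Ð are the lone bytes 0xD1 and 0xD0, the UTF-8 lead bytes for code points U+0400 through U+047F. That would be consistent with a feature that boosts Cyrillic continuations.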