INDEX
Explanations
words written in a specific script
occurrences of a specific character in text, likely related to the Cyrillic alphabet
Negative Logits
riott     -0.84
icio      -0.74
aido      -0.72
arching   -0.69
anced     -0.69
arios     -0.69
Pyth      -0.69
urst      -0.66
rament    -0.66
20439     -0.65
Positive Logits
м         1.16
Ð         1.15
н         1.11
к         1.11
я         1.11
т         1.10
д         1.10
и         1.06
в         1.06
л         1.04
Activations Density 0.007%