Explanations
Cyrillic characters
special characters or symbols, specifically 'ÑĢ' and other variations
Negative Logits
Joy       -0.80
Cause     -0.66
BIL       -0.66
terson    -0.65
Laos      -0.65
help      -0.65
holders   -0.64
auga      -0.64
peed      -0.62
Spur      -0.61

Positive Logits
оÐ        1.27
Ñĥ        1.23
а         1.21
и         1.21
о         1.20
е         1.12
Ñĭ        1.02
ÑĮ        0.98
н         0.94
ÑĢ        0.90
Activations Density 0.017%
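
Strings such as 'ÑĢ' in the explanation and the positive logits look like raw UTF-8 bytes as displayed by a GPT-2-style byte-level BPE tokenizer. The page does not name the tokenizer, so that is an assumption. Under it, the sketch below rebuilds the GPT-2 byte-to-character table and inverts it to recover the underlying text; `decode_token` is a hypothetical helper name, not part of any library.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 byte -> printable-character table.

    Assumption: the dashboard's tokenizer uses this GPT-2 convention.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Turn a byte-level display string into text (hypothetical helper).

    Raises KeyError for characters outside the table, e.g. tokens the
    page already shows as decoded Cyrillic letters.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Incomplete UTF-8 sequences become U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ÑĢ", "Ñĥ", "Ñĭ", "ÑĮ"]:
        print(repr(tok), "->", decode_token(tok))
```

If the assumption holds, 'ÑĢ', 'Ñĥ', 'Ñĭ' and 'ÑĮ' decode to the Cyrillic letters р, у, ы and ь, which would agree with the "Cyrillic characters" explanation and with the already-decoded letters in the same list.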