INDEX
Explanations
punctuation marks, particularly periods and commas
Negative Logits
olt      -0.17
ucci     -0.16
год      -0.16
lett     -0.15
TRS      -0.14
esini    -0.14
ستÛĮ     -0.14
/npm     -0.14
487      -0.14
eling    -0.14
Positive Logits
bah      0.16
Ster     0.15
ikhail   0.15
sdale    0.15
onces    0.15
abez     0.15
oba      0.14
èİ       0.14
ohana    0.14
uni      0.14
Activations Density 0.004%