INDEX
Explanations
specific names of people, organizations, and locations
Negative Logits
ģm        -0.36
erm       -0.30
ám        -0.29
äm        -0.29
orm       -0.29
óm        -0.28
ãĥ¼ãĥł    -0.28
mam       -0.28
irm       -0.28
imum      -0.28
Positive Logits
à¸Ļà¸Ń    0.13
ınca      0.13
tıģ       0.12
çĶ        0.12
tıģı      0.12
isen      0.11
даеÑĤ     0.10
ılıp      0.10
;line     0.10
ξε        0.10
Activations Density: 0.456%