INDEX
Explanations
possessive forms of nouns
Negative Logits
z       -0.25
y       -0.22
e       -0.21
i       -0.21
er      -0.20
c       -0.20
al      -0.20
a       -0.20
Ùĩ      -0.19
à¸Ļ     -0.19
Positive Logits
ï¸ı     0.18
ζη      0.17
pha     0.14
aman    0.14
ÐĽÐ¬    0.13
/-      0.13
.au     0.13
inn     0.13
/DD     0.13
lef     0.13
Activations Density: 0.098%
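One common reading of an activation density figure is the share of sampled tokens on which the feature fires at all. The page does not say how its value was computed, so the sketch below is an assumption: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of tokens with a nonzero
    (above-threshold) activation; the source page does not confirm this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 98 of 100,000 tokens.
sample = [0.0] * 99_902 + [1.5] * 98
print(f"{activation_density(sample):.3f}%")  # prints 0.098%
```

Under this reading, a density of 0.098% would mean the feature is active on roughly one token in a thousand, which is typical of a sparse feature.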