Explanations
names of places and cultural references
Negative Logits
adel     -0.17
fff      -0.16
ναν      -0.15
уда      -0.15
uchen    -0.14
hen      -0.14
umi      -0.14
kır      -0.14
umar     -0.14
hi       -0.13
Positive Logits
ég       0.20
ág       0.19
zt       0.19
zo       0.17
AGED     0.16
ereg     0.16
zer      0.16
ietet    0.16
apot     0.15
rung     0.15
Activations Density 0.002%