Explanations
proper nouns, particularly names of people and places
Negative Logits
POCH      -0.17
ynet      -0.17
raya      -0.17
forman    -0.16
leta      -0.15
Ïĩν       -0.15
itele     -0.15
lds       -0.15
.nano     -0.14
ä¸ĢåĮº    -0.14
Positive Logits
uren        0.18
lingen      0.17
elen        0.17
Fernandez   0.16
ij          0.16
eren        0.16
Bur         0.15
eden        0.15
ellen       0.15
Eden        0.15
Activations Density 0.012%