INDEX
Explanations
proper nouns, specifically names of people and locations
Negative Logits
sah      -0.15
awner    -0.14
zte      -0.13
lož      -0.13
dni      -0.13
atak     -0.12
зи       -0.12
ëĶ©      -0.12
occan    -0.12
IBLE     -0.12
Positive Logits
ian      0.17
our      0.17
shire    0.17
ism      0.16
ory      0.15
ì¦ĺ      0.15
amente   0.15
ous      0.15
istan    0.15
ians     0.14
Activations Density 0.138%