Explanations
proper nouns, particularly names of places and people
Negative Logits
iminal     -0.16
           -0.14
ocos       -0.14
vvm        -0.14
setattr    -0.13
वर         -0.13
/big       -0.13
Вики       -0.13
irket      -0.13
arching    -0.13
Positive Logits
shire      0.17
erna       0.16
ians       0.15
zhou       0.15
maal       0.15
pio        0.15
oen        0.15
afen       0.14
ardo       0.14
adena      0.14
Activations Density 0.399%