Explanations
proper nouns, specifically names of people and places
Negative Logits
ullan     -0.18
iasi      -0.16
ži        -0.15
otas      -0.15
/posts    -0.15
ystone    -0.15
orama     -0.14
uppe      -0.14
oucher    -0.14
lisi      -0.14
Positive Logits
elog      0.14
è¨İ       0.14
lyon      0.14
íĥ        0.14
E         0.14
dec       0.14
วà¸Ķ      0.13
Millenn   0.13
Beacon    0.13
iros      0.13
Activations Density: 0.007%