INDEX
Explanations
proper nouns, specifically names and locations
Negative Logits
ship     -0.29
sh       -0.28
sel      -0.28
ships    -0.26
sWith    -0.25
side     -0.24
res      -0.23
sy       -0.22
sc       -0.21
shire    -0.21
Positive Logits
eker     0.18
-vous    0.17
chio     0.17
kind     0.16
chia     0.16
bral     0.16
dings    0.15
edn      0.15
pillar   0.15
ed       0.15
Activations Density 0.880%