Explanations
proper nouns, particularly names or brands
Negative Logits
Token     Logit
eling     -0.19
elli      -0.18
idon      -0.17
elly      -0.17
elle      -0.17
elson     -0.16
es        -0.15
ese       -0.15
ellan     -0.15
et        -0.15
Positive Logits
Token     Logit
entine    0.28
entina    0.26
uable     0.25
uation    0.24
leys      0.23
entin     0.23
uations   0.23
uetype    0.23
val       0.22
=val      0.21
Activations Density: 0.025%