INDEX
Explanations
proper nouns with a specific pattern
proper nouns and names
Negative Logits
predec        -0.83
cryst         -0.69
avorite       -0.68
pection       -0.65
Thumbnails    -0.64
ishable       -0.64
sufficient    -0.63
theless       -0.61
artif         -0.61
bulky         -0.61
Positive Logits
Rowling       0.85
EStream       0.74
zyk           0.70
schild        0.69
pod           0.69
nik           0.69
Sabha         0.68
ernaut        0.68
xon           0.67
bush          0.66
Activations Density: 0.397%