Explanations
politicians' last names
proper nouns, particularly names of politicians and public figures
Negative Logits
Token       Logit
lehem       -0.88
Corpus      -0.75
Copyright   -0.68
olulu       -0.68
Redd        -0.64
awks        -0.64
Tasman      -0.64
ãĥŁ         -0.63
Artemis     -0.63
neutron     -0.62
Positive Logits
Token       Logit
himself     1.22
's          0.97
personally  0.94
Himself     0.87
ism         0.85
swore       0.82
aides       0.79
Care        0.77
isms        0.77
lied        0.77
Activations Density: 0.174%
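
The density figure is not defined on this page. A minimal sketch of one common reading, assuming it means the fraction of sampled token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is read as (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data for illustration only: 10,000 positions, 17 of them active.
sample = [0.0] * 9983 + [1.5] * 17
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density well below 1% describes a sparse feature that fires on only a small share of tokens.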