Explanations
proper nouns, like names of people and places
proper nouns, specifically names of people or characters
Negative Logits
LEASE     -0.70
rium      -0.67
Corpus    -0.65
berra     -0.65
Pixie     -0.65
Ñı        -0.65
Lex       -0.64
ktop      -0.61
ÑĮ        -0.60
.ãĢį      -0.59
Positive Logits
enegger      1.10
testified    0.97
himself      0.90
wrote        0.84
told         0.81
admitted     0.80
herself      0.80
remembers    0.80
admits       0.79
penned       0.79
Activations Density 0.136%
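
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of token positions on which a feature's activation exceeds a threshold. This is an assumption about what the figure means, not a description of how this dashboard computed it. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as "active" when its activation is strictly
    greater than the threshold. The source does not state the exact rule.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 14 of which fire.
sample = [0.0] * 9986 + [1.5] * 14
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.140%
```

Under this reading, a density of 0.136% would mean the feature fires on roughly 1 in 735 token positions.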