INDEX
Explanations
proper nouns, especially names of individuals
proper nouns, particularly names of individuals and entities
Negative Logits
anooga       -0.70
ertain       -0.68
Corpus       -0.67
gom          -0.65
asia         -0.61
orporated    -0.61
ovies        -0.60
":[          -0.60
terson       -0.60
Lima         -0.60
Positive Logits
senal          1.46
cliffe         0.88
uling          0.81
haps           0.80
fect           0.78
ansom          0.77
ascal          0.76
kj             0.73
interstitial   0.73
utherford      0.71
Activations Density: 0.175%