Explanations
people's names
the presence of proper nouns, specifically names of individuals
Negative Logits
enei        -0.89
atility     -0.84
opard       -0.80
heed        -0.79
etsk        -0.79
compr       -0.78
pered       -0.77
hene        -0.76
ACH         -0.69
sylvania    -0.69
Positive Logits
Gathering    0.83
"$:/         0.75
EntityItem   0.73
Ny           0.73
Advisory     0.70
Freder       0.70
icle         0.70
ãĥ¤          0.69
Laur         0.68
istic        0.68
Activations Density 0.016%