Explanations
names of individuals
mentions of the name "Charles."
Negative Logits
oops     -0.68
atical   -0.66
ded      -0.64
bably    -0.63
acial    -0.62
aths     -0.61
compr    -0.61
unden    -0.61
leased   -0.60
yrim     -0.60
Positive Logits
Manson   1.16
Dickens  1.14
Barkley  1.08
Darwin   1.05
worth    1.04
olini    0.95
Schw     0.95
Scrib    0.92
Laugh    0.86
Xavier   0.85
Activations Density 0.015%