INDEX
Explanations
categories, perhaps about famous individuals named Charles
occurrences of the name "Charles" across various contexts
Negative Logits
atical      -0.78
senal       -0.78
compr       -0.71
atically    -0.68
awaru       -0.66
¥µ          -0.65
yrim        -0.65
bably       -0.65
aths        -0.64
atics       -0.64
Positive Logits
Dickens     1.10
worth       1.09
Barkley     1.09
Manson      1.07
Darwin      1.00
Schw        0.94
Koch        0.92
olini       0.88
Scrib       0.87
Grassley    0.87
Activations Density 0.010%