INDEX
Explanations
names of individuals
mentions of the name "Anne."
Negative Logits
yz          -0.81
ODUCT       -0.75
oted        -0.74
istani      -0.73
ivated      -0.72
ramid       -0.72
notes       -0.72
ornia       -0.71
oter        -0.70
sidx        -0.70
POSITIVE LOGITS
Marie        1.17
Marie        1.14
Hath         1.10
Thatcher     1.05
Louise       0.93
Anne         0.93
Rice         0.89
Anne         0.88
Wynne        0.87
Helen        0.87
Activations Density 0.048%