INDEX
Explanations
occurrences of the name "Marie" in various contexts
Negative Logits
Token               Logit
oted                -0.80
okin                -0.77
iating              -0.73
iances              -0.73
=-=-=-=-=-=-=-=-    -0.73
ODUCT               -0.73
iction              -0.72
ellation            -0.72
igation             -0.72
dfx                 -0.71
Positive Logits
Token               Logit
Claire              1.04
lla                 0.97
Louise              0.93
Marie               0.93
Anne                0.90
Slaughter           0.85
Marie               0.82
Thatcher            0.81
Anne                0.79
anne                0.78
Activations Density 0.002%