Explanations
names starting or ending with "Marie"
the name "Marie" in various contexts
Negative Logits
oted       -0.78
orage      -0.78
itious     -0.74
otional    -0.74
ORE        -0.72
ainment    -0.70
raints     -0.69
olding     -0.68
recomm     -0.68
andals     -0.68
Positive Logits
Claire     1.00
Marie      0.98
lette      0.90
Marie      0.89
Slaughter  0.84
Louise     0.82
lla        0.82
mosqu      0.77
bilt       0.76
theless    0.70
Activations Density: 0.003%