INDEX
Explanations
names, specifically the name "Marie"
mentions of specific names, particularly "Marie" and "Pierre"
Negative Logits
ODUCT     -0.86
oted      -0.81
ORE       -0.79
otional   -0.76
ifiers    -0.76
itious    -0.75
rawling   -0.75
ifying    -0.74
atable    -0.73
arta      -0.73
Positive Logits
Claire      0.97
lla         0.86
Slaughter   0.84
fen         0.78
lette       0.76
Marie       0.72
Louise      0.69
Ples        0.67
bilt        0.66
Cur         0.66
Activations Density 0.017%