INDEX
Explanations
references to the name "Sarah."
mentions of the name "Sarah."
Negative Logits
Token       Logit
awaru       -1.08
ebin        -0.91
nomine      -0.81
omething    -0.81
ribution    -0.80
inho        -0.79
chwitz      -0.78
OWER        -0.77
eu          -0.75
ega         -0.70
Positive Logits
Token       Logit
Palin       1.22
Jane        0.95
Sarah       0.92
Chal        0.89
Sarah       0.85
Connor      0.85
Jessica     0.84
Sina        0.83
Kate        0.82
Michelle    0.82
Activations Density: 0.011%