INDEX
Explanations
references to mothers and maternal figures
Negative Logits
Token            Logit
Danilo           -0.77
](#              -0.75
ardust           -0.69
SuppressMessage  -0.69
Arif             -0.66
capaian          -0.66
viato            -0.65
==",             -0.64
Pickles          -0.64
Arundel          -0.63
Positive Logits
Token            Logit
mother           2.60
Mother           2.52
mothers          2.39
Mother           2.38
MOTHER           2.36
mother           2.32
Mothers          2.25
MOTHER           2.17
Mothers          2.14
madre            1.89
Activations Density: 0.036%