INDEX
Explanations
instances of the word "mother" with high activation values
references to family members, particularly mothers and fathers
Negative Logits
Token         Logit
intensity     -0.72
uters         -0.71
mble          -0.68
atility       -0.68
arcity        -0.67
estyles       -0.66
uers          -0.66
ensitivity    -0.63
ilateral      -0.63
uve           -0.62
Positive Logits
Token         Logit
hood          0.86
uncle         0.85
ma            0.79
grandmother   0.78
aunt          0.76
ancestor      0.74
niece         0.72
disappro      0.72
mummy         0.71
parents       0.70
Activations Density: 0.076%
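
As a rough illustration of how a density figure like this can be read, the sketch below assumes that density means the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions for illustration only; they are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. The dashboard's exact definition is not stated.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 76 of which activate the feature.
sample = [0.0] * 99_924 + [1.0] * 76
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.076% would mean the feature fires on roughly 76 of every 100,000 tokens, which fits a feature tied to a narrow set of family-related words.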