INDEX
Explanations
mentions or descriptions of daughters
references to daughters
Negative Logits
kefeller    -0.87
constitu    -0.81
ilitarian   -0.72
uchin       -0.70
dstg        -0.69
hent        -0.69
Tribunal    -0.67
psey        -0.67
etting      -0.66
vernment    -0.66
Positive Logits
Ivanka      0.96
hood        0.82
Louise      0.81
daughter    0.80
Anne        0.79
Isabel      0.77
daughters   0.77
girl        0.73
Hannah      0.73
Daughter    0.71
Activations Density: 0.016%