INDEX
Explanations
mentions of family relationships, particularly focusing on daughters
mentions of family relationships, specifically focusing on daughters
Negative Logits
kefeller    -0.84
constitu    -0.76
dstg        -0.75
ilitarian   -0.68
uchin       -0.66
etting      -0.65
ioxide      -0.64
hent        -0.63
ypes        -0.63
vernment    -0.63
Positive Logits
Ivanka      0.87
daughter    0.85
Anne        0.80
daughters   0.80
Isabel      0.78
daughter    0.78
Louise      0.77
hood        0.77
Maria       0.76
ishly       0.76
Activations Density 0.015%