INDEX
Explanations
parents and their number of children
phrases indicating familial relationships
Negative Logits
ioxide      -0.66
olid        -0.65
Ĥİ          -0.65
nels        -0.65
rooft       -0.65
krit        -0.62
eele        -0.62
govtrack    -0.62
issions     -0.61
NRS         -0.61
Positive Logits
daughters        0.99
twins            0.96
thood            0.92
daughter         0.86
daughter         0.83
sons             0.76
children         0.76
toddlers         0.74
bride            0.73
grandchildren    0.73
Activations Density 0.071%