INDEX
Explanations
words related to family members and relationships
references to familial or maternal themes
Negative Logits
vernment    -0.85
é¾          -0.75
lawy        -0.71
odox        -0.70
constitu    -0.69
ngth        -0.67
iates       -0.67
DoS         -0.66
unda        -0.66
ESSION      -0.66
Positive Logits
cup         0.89
moms        0.85
parents     0.84
Gifts       0.84
cot         0.83
cereal      0.81
daughters   0.80
Parents     0.80
hes         0.79
wife        0.78
Activations Density: 0.268%