INDEX
Explanations
references to family relationships and parental influence
Negative Logits
Friend        -0.18
friend        -0.17
oun           -0.17
friend        -0.17
Friend        -0.16
nephew        -0.16
friendship    -0.16
riend         -0.15
isbury        -0.15
Girlfriend    -0.14
Positive Logits
remar         0.24
abusive       0.17
immigrants    0.17
absentee      0.17
edback        0.15
divorced      0.15
supportive    0.15
IGO           0.15
uell          0.14
antha         0.14
Activations Density: 0.102%