INDEX
Explanations
family relationships, specifically those involving in-laws
references to familial relationships, particularly those involving a son-in-law
Negative Logits
Token       Logit
nels        -0.66
idelines    -0.64
surface     -0.62
tten        -0.61
IUM         -0.61
Effective   -0.61
DER         -0.59
breeze      -0.57
brink       -0.57
CONTIN      -0.56
Positive Logits
Token       Logit
Sisters     0.85
nephew      0.85
brother     0.76
cousins     0.72
daughters   0.71
grandson    0.71
brothers    0.71
sibling     0.70
sister      0.70
ée          0.68
Activations Density: 0.179%