INDEX
Explanations
references to familial relationships and family dynamics
Negative Logits
aunt      -0.19
uncle     -0.16
oldt      -0.16
fian      -0.16
dorf      -0.15
udev      -0.15
astle     -0.14
oř        -0.14
elders    -0.14
işlem     -0.14
Positive Logits
son            0.50
sons           0.50
grandson       0.46
daughter       0.45
granddaughter  0.44
daughters      0.43
grandchildren  0.41
Daughter       0.37
Sons           0.36
Son            0.36
Activations Density 0.321%