INDEX
Explanations
references to male siblings
references to familial relationships, specifically focusing on brothers and sisters
Negative Logits
idine        -0.64
orie         -0.64
oval         -0.63
IV           -0.62
ebin         -0.62
ORE          -0.62
employment   -0.62
itation      -0.61
ģ            -0.61
ved          -0.61
Positive Logits
brothers     1.13
Brothers     1.04
hips         1.01
hip          1.00
sisters      0.87
pins         0.85
hift         0.83
omething     0.82
tones        0.80
pace         0.79
Activations Density 0.015%