Explanations
themes related to family structure and parenting
Negative Logits
orm        -0.17
ý          -0.16
ruku       -0.16
ibern      -0.15
uncle      -0.15
راÙĨÙĩ      -0.15
aunt       -0.15
Sibling    -0.15
Fem        -0.14
cousin     -0.14
Positive Logits
children     0.54
children     0.45
Children     0.42
kids         0.41
Children     0.41
_children    0.40
.children    0.34
children     0.32
(children    0.31
kids         0.31
Activations Density: 0.169%