Explanations
positive descriptions of children and their experiences
Negative Logits
Token       Logit
illez       -0.16
resher      -0.15
Pregn       -0.15
丈夫        -0.15
married     -0.15
celib       -0.15
pregnant    -0.14
ladies      -0.14
ipur        -0.14
Ladies      -0.14
Positive Logits
Token       Logit
innocence   0.21
learning    0.21
school      0.20
todd        0.19
inher       0.18
learning    0.18
sibling     0.18
prec        0.18
innoc       0.17
siblings    0.17
Activations Density 0.520%