INDEX
Explanations
references to daughters and female offspring
Negative Logits
esgue          -0.88
defaultstate   -0.86
المناصب        -0.83
kus            -0.77
loys           -0.72
."],           -0.72
OfWork         -0.71
tetra          -0.71
sätzlich       -0.70
✨:            -0.70
Positive Logits
daughters      1.38
filles         1.24
daughter       1.23
UGHTER         1.21
daughter       1.10
Daughter       1.08
Daughter       1.07
girls          1.04
Daughters      1.03
datter         0.97
Activations Density 0.057%