INDEX
Explanations
phrases that describe familial relationships and parental roles
Negative Logits
e            -0.18
Paginator    -0.15
amin         -0.14
rid          -0.14
gh           -0.14
ins          -0.13
ure          -0.13
sti          -0.13
ew           -0.13
oth          -0.13
Positive Logits
ucch         0.17
.gwt         0.14
igers        0.14
ุส           0.14
slain        0.14
ato          0.14
conv         0.14
Král         0.14
šov          0.14
resas        0.14
Activations Density: 0.044%