Explanations
references to female characters and their relationships
Negative Logits
Token       Logit
istrinya    -0.98
娶          -0.83
his         -0.67
vợ          -0.64
obrigado    -0.61
rád         -0.61
僕は        -0.59
wives       -0.58
his         -0.58
妻子        -0.58
Positive Logits
Token       Logit
husband     2.13
husband     1.75
Husband     1.61
Husband     1.54
husbands    1.52
boyfriend   1.41
marido      1.38
hubby       1.34
esposo      1.27
marito      1.27
Activations Density 0.341%
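
As a rough illustration of what a density figure like the one above can mean, the sketch below computes the fraction of token positions on which a feature is active. It assumes density is defined as "activation above zero" over a sample of tokens; the page itself does not state the definition. The function name, the threshold parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not spell this out.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy sample: 1000 token positions, 3 of them active.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.341% would correspond to the feature firing on roughly 3 to 4 of every 1000 sampled tokens.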