Explanations
references to relationships and marital dynamics
NEGATIVE LOGITS
Husband     -0.33
boyfriend   -0.32
husbands    -0.29
husband     -0.25
丈夫        -0.24
hubby       -0.23
мужчин      -0.22
муж         -0.20
muž         -0.19
aida        -0.19
POSITIVE LOGITS
wife        0.80
Wife        0.67
wife        0.63
-wife       0.57
妻          0.56
wives       0.56
esposa      0.43
vợ          0.42
vrouw       0.38
夫人        0.37
Activations Density: 0.188%