INDEX
Explanations
mentions of the word "wife"
mentions of the word "wife."
Negative Logits
etting       -0.80
oday         -0.77
umbn         -0.76
aneously     -0.75
orescent     -0.74
osta         -0.73
kefeller     -0.72
Flavoring    -0.70
oresc        -0.68
obyl         -0.67
Positive Logits
wife         1.13
hood         0.88
fol          0.79
friend       0.78
wife         0.75
women        0.75
doctor       0.74
shake        0.74
Esther       0.73
Wife         0.73
Activations Density 0.023%