INDEX
Explanations
mentions of girlfriends
mentions of the word "girlfriend."
Negative Logits
wise         -0.67
arth         -0.63
iers         -0.63
Chart        -0.62
otin         -0.60
.–           -0.59
Chart        -0.59
Component    -0.59
sbm          -0.59
Actions      -0.59
POSITIVE LOGITS
girlfriend    3.42
girlfriend    2.77
girlfriends   2.41
boyfriend     2.28
fiance        2.17
wife          2.01
fian          2.01
roommate      1.90
irlfriend     1.71
mistress      1.67
Activations Density 0.011%