INDEX
Explanations
references to romantic relationships and interactions between partners
mentions of romantic relationships, specifically focusing on the term "boyfriend."
Negative Logits
mble          -0.84
XP            -0.82
pmwiki        -0.80
uchin         -0.79
Downloadha    -0.77
ichen         -0.73
Printed       -0.72
urgical       -0.71
é¾            -0.68
Nanto         -0.68
Positive Logits
boyfriend     0.92
friend        0.88
girlfriend    0.81
partner       0.81
husband       0.77
friends       0.77
hood          0.75
rities        0.73
volent        0.72
ships         0.71
Activations Density: 0.008%