INDEX
    Explanations

    references to romantic relationships and interactions between partners

    mentions of romantic relationships, specifically focusing on the term "boyfriend."

    Negative Logits
    Token            Logit
    "mble"           -0.84
    "XP"             -0.82
    "pmwiki"         -0.80
    "uchin"          -0.79
    "Downloadha"     -0.77
    "ichen"          -0.73
    " Printed"       -0.72
    "urgical"        -0.71
    "é¾"             -0.68
    " Nanto"         -0.68
    Positive Logits
    Token            Logit
    " boyfriend"     0.92
    "friend"         0.88
    " girlfriend"    0.81
    " partner"       0.81
    "husband"        0.77
    "friends"        0.77
    "hood"           0.75
    "rities"         0.73
    "volent"         0.72
    "ships"          0.71
    Activation Density: 0.008%

    No Known Activations