INDEX
    Explanations

    mentions of love and romantic relationships

    Negative Logits
     Administrativna    -0.69
     للمعارف    -0.52
    Hentet    -0.51
    PyExc    -0.51
    ########.    -0.51
     utafitiHapana    -0.49
    ロウィン    -0.47
    ophanes    -0.47
     manna    -0.47
    ցված    -0.46
    Positive Logits
     couples    0.72
     romantic    0.72
     romance    0.68
     marriage    0.68
     dating    0.67
     Dating    0.67
     Couples    0.66
    Dating    0.64
    💏    0.60
     Marriage    0.59
    Activation Density 0.601%

    No Known Activations