    Explanations

    words related to romantic relationships

    references to romantic themes and relationships

    Negative Logits
    Token         Logit
    "upon"        -0.89
    "avis"        -0.81
    "ldon"        -0.81
    "aston"       -0.77
    "Reviewer"    -0.77
    "uckles"      -0.75
    "irin"        -0.72
    "aver"        -0.71
    "ifted"       -0.71
    "avers"       -0.70
    Positive Logits
    Token            Logit
    " romantic"      1.15
    " monog"         0.94
    " romance"       0.84
    "antically"      0.83
    " Romance"       0.80
    " Romantic"      0.77
    "ized"           0.77
    " erotic"        0.76
    "ties"           0.75
    " consensual"    0.73
    Activation Density: 0.006%

    No Known Activations