INDEX
    Explanations

    terms related to romantic relationships and couple activities

    romantic date, couples, honeymoon, love partner

    Negative Logits
    " AssemblyCompany"    -0.85
    " AssemblyCulture"    -0.81
    "<unused42>"          -0.81
    "<unused23>"          -0.81
    "<unused79>"          -0.81
    "<unused41>"          -0.80
    "<unused43>"          -0.80
    "<unused16>"          -0.80
    "<pad>"               -0.80
    "<unused8>"           -0.80
    Positive Logits
    " romantic"           0.76
    " honeymoon"          0.60
    " Romantic"           0.60
    " couples"            0.56
    "Romantic"            0.52
    " romantis"           0.50
    "romantic"            0.48
    " Honeymoon"          0.48
    " Couples"            0.47
    " romance"            0.45
    Activation Density 0.009%

    No Known Activations