INDEX
    Explanations

    affectionate terms and references to shortcomings

    Negative Logits
    "selaer"     -0.56
    "Vietnam"    -0.53
    "URBANA"     -0.52
    "Kanye"      -0.52
    "iastes"     -0.51
    " ویکی"      -0.51
    " DNC"       -0.51
    " Bihar"     -0.50
    " bmx"       -0.50
    " Process"   -0.49
    Positive Logits
    " Darling"      1.84
    "Darling"       1.79
    " darling"      1.74
    " sweetheart"   0.79
    " dearest"      0.77
    "Dearest"       0.71
    " Adorable"     0.63
    " adorable"     0.62
    "dear"          0.61
    " querida"      0.61
    Activation Density: 0.002%

    No Known Activations