INDEX
    Explanations

    references to LGBTQ+ pride and activism

    Negative Logits
    Token             Logit
    "elo"             -0.16
    " homosexuals"    -0.15
    "aden"            -0.15
    "olo"             -0.14
    "aret"            -0.14
    "olon"            -0.14
    "etus"            -0.14
    " Verd"           -0.14
    "istrovství"      -0.14
    ".Mask"           -0.14
    Positive Logits
    Token             Logit
    " rights"         0.25
    " pride"          0.23
    "IQ"              0.22
    "-rights"         0.22
    "-friendly"       0.20
    "QQ"              0.20
    " Rights"         0.20
    "_rights"         0.20
    " community"      0.18
    " Pride"          0.18
    Activation Density: 0.024%

    No Known Activations