INDEX
    Explanations

    references to LGBTQ+ themes and terminology, particularly related to gay pride and rights

    Negative Logits
    Token             Logit
    "cker"            -0.18
    " homosexuals"    -0.16
    "inus"            -0.15
    "_SECURITY"       -0.15
    " gays"           -0.15
    "ÑģÑĤин"          -0.15
    "帯"              -0.14
    "Instances"       -0.14
    "bjerg"           -0.14
    "istrovstvÃŃ"     -0.14
    Positive Logits
    Token             Logit
    " rights"         0.28
    "dar"             0.28
    "-rights"         0.28
    "bor"             0.26
    " pride"          0.24
    "-friendly"       0.23
    "atri"            0.23
    "lord"            0.22
    " Rights"         0.22
    " RIGHTS"         0.22
    Activation Density: 0.025%

    No Known Activations