INDEX
    Explanations

    references to the word "Que" and its variations, indicating a focus on LGBTQ+ themes

    Negative Logits
    tha      -0.17
    igid     -0.15
    ont      -0.15
    shop     -0.15
    son      -0.14
    phin     -0.14
    aret     -0.14
    rade     -0.14
    sWith    -0.14
    adies    -0.14
    Positive Logits
    ixer      0.18
    ijo       0.17
    ued       0.17
    ens       0.17
    estion    0.16
    iro       0.16
    iros      0.16
     Que      0.16
    jas       0.15
    366       0.15
    Activation Density: 0.008%

    No Known Activations