    Explanations

    mentions of LGBTQ-related themes

    Negative Logits
    ëį°    -0.22
    ร    -0.20
    न    -0.19
    ums    -0.18
    ily    -0.17
     majority    -0.17
    ãģ¨ãģĵãĤį    -0.16
    ष    -0.15
    ãģĤãĤĬ    -0.15
    ãģįãģŁ    -0.15
    Positive Logits
    eenth    0.17
    ————————————————    0.17
    ed    0.17
    à¸Ļ    0.17
    ãģĦãģ¾ãģĻ    0.16
    chy    0.15
    ëĭ¤ëĬĶ    0.15
    edl    0.15
    /cop    0.15
    aroo    0.15
    Activation Density: 0.200%

    No Known Activations