INDEX
    Explanations

    terms related to LGBTQ+ topics

    Negative Logits
    " ICC"      -0.63
    "ERAL"      -0.62
    "uania"     -0.61
    "ERSON"     -0.60
    " bats"     -0.60
    "sonian"    -0.60
    "iaries"    -0.59
    " Hes"      -0.58
    "lessly"    -0.58
    " TODAY"    -0.57

    Positive Logits
    "erness"    1.25
    "zon"       1.14
    "uing"      1.09
    "ues"       1.02
    "ued"       1.00
    "edo"       0.97
    "que"       0.96
    "asy"       0.94
    "ue"        0.87
    "enne"      0.86

    Act Density 0.026%

    No Known Activations