INDEX
    Explanations

    references to Pride celebrations and symbols associated with the LGBTQ+ community

    Negative Logits
    KHTML          -0.16
    jang           -0.15
    /Branch        -0.15
    istrovstvÃŃ    -0.15
    emin           -0.15
     اÙĦذ          -0.15
    ypress         -0.15
    389            -0.14
    utex           -0.14
    ppo            -0.14
    Positive Logits
     rain          0.48
    Rain           0.47
    rain           0.45
     Rain          0.45
     Rainbow       0.42
     rainbow       0.42
    RAIN           0.38
     rainfall      0.31
    bows           0.31
     spectrum      0.30
    Activation Density 0.074%

    No Known Activations