    Explanations

    mentions of specific names or terms related to LGBTQ topics and vague legal language

    followed by "g" in the middle

    activity or ending in gy

    Negative Logits
     nahilalakip       -0.68
     ویکی‌پدی          -0.65
     незавершена       -0.62
    ::::::::           -0.61
    :✨                 -0.61
    PerformLayout      -0.60
     Terraria          -0.59
     []:               -0.59
    StoryboardSegue    -0.59
    awaiter            -0.58
    Positive Logits
    ging     0.78
    ged      0.71
    gs       0.70
    gy       0.70
    ggg      0.68
    gers     0.62
    gggg     0.61
    gings    0.60
    ges      0.60
    GER      0.60
    Activation Density: 0.937%

    No Known Activations