INDEX
    Explanations

    References to the word "Gay", potentially indicating a focus on a specific person or topic named "Gay".

    Negative Logits
    Token       Logit
    KT          -0.84
    ocobo       -0.81
    è¦ļéĨĴ      -0.79
    éĹĺ         -0.77
    âĵĺ         -0.74
    arily       -0.74
    ACTED       -0.72
    aeda        -0.72
    etsk        -0.70
    igslist     -0.70
    Positive Logits
    Token       Logit
    atri        1.00
    bian        0.97
     Gay        0.95
    lord        0.94
    dar         0.92
    lyn         0.89
    zilla       0.85
    nor         0.85
    ety         0.84
    cott        0.83
    Act Density 0.005%
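
A minimal sketch of how a density figure like the one above could be read, assuming (this is not stated on the page) that activation density means the fraction of token positions where the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions on which the
    feature fires at all; the dashboard may define it differently.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 5 active positions among 100,000 tokens.
sample = [0.0] * 99_995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.005% corresponds to roughly 5 active token positions per 100,000, which would fit a feature that fires only on rare, specific tokens.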

    No Known Activations