INDEX
    Explanations

    mentions of individuals, particularly names or handles associated with social media or public figures

    NEGATIVE LOGITS
    "ropa"      -0.16
    "site"      -0.15
    "geme"      -0.14
    " Bare"     -0.14
    "MMdd"      -0.14
    "VID"       -0.14
    "erdale"    -0.14
    " Hague"    -0.14
    " SPORT"    -0.14
    "_ajax"     -0.13
    POSITIVE LOGITS
    "bidden"    0.18
    "antan"     0.16
    "nesc"      0.15
    " Taş"      0.15
    "builtin"   0.15
    "ADDE"      0.14
    "ña"        0.14
    "ï¸"        0.14
    "нок"       0.14
    "alth"      0.14
    Activation Density: 0.073%

    No Known Activations