INDEX
    Explanations

    references to former professional athletes or individuals associated with sports

    Negative Logits
     písem    -0.20
    urger     -0.16
    yal       -0.15
     Kir      -0.15
    _LARGE    -0.15
    :::       -0.15
    aire      -0.14
    pter      -0.14
    957       -0.14
    regs      -0.14
    Positive Logits
     cel      0.19
    Cel       0.18
     mist     0.17
     vole     0.17
     Cel      0.16
     Soup     0.16
    .dev      0.16
     MS       0.16
    ibble     0.16
     Mist     0.15
    Act Density 0.006%

    No Known Activations