INDEX
    NEGATIVE LOGITS
    Token           Logit
    "athlon"        -0.07
    "--------↵↵"    -0.06
    "_CON"          -0.06
    ".BOTTOM"       -0.06
    " emotional"    -0.06
    " والم"         -0.06
    " Thumbnails"   -0.06
    " Not"          -0.06
    " этой"         -0.06
    " Africa"       -0.06
    POSITIVE LOGITS
    Token           Logit
    " سرو"          0.07
    " dcc"          0.06
    "aines"         0.06
    "oss"           0.06
    "_sentence"     0.06
    "erti"          0.06
    " overseeing"   0.06
    " mlad"         0.06
    "oron"          0.06
    " Flynn"        0.06
    Activation Density: 0.042%

    No Known Activations