    Negative Logits
    (token not shown)   -0.06
    .INTER              -0.06
     recurs             -0.06
     bans               -0.06
    _small              -0.06
    yssey               -0.06
     Россия             -0.06
     причина            -0.06
    (token not shown)   -0.06
     سبز                -0.06

    Positive Logits
     exclusively        0.07
    !!↵                 0.07
    ))):↵               0.06
    [                   0.06
    oucí                0.06
    '.↵                 0.06
    :{↵                 0.06
    }))↵                0.06
    "})↵                0.06
    ++)↵                0.06

    Act Density 0.000%

    No Known Activations