    Negative Logits
    " тверд"     -0.07
    " pouch"     -0.07
    " hues"      -0.07
    "ворення"    -0.07
    "mans"       -0.06
    "02"         -0.06
    " rub"       -0.06
    "_sale"      -0.06
    " EOS"       -0.06
    " Potter"    -0.06
    Positive Logits
    " baktı"     0.06
    " dearly"    0.06
    " illeg"     0.06
    "aze"        0.06
    " richest"   0.05
    "onse"       0.05
    " RESERVED"  0.05
    " khắc"      0.05
    " мене"      0.05
    "@↵↵"        0.05
    Activation Density: 0.107%

    No Known Activations