INDEX
    Negative Logits
    Token          Logit
    " Pai"         -0.07
    "ΑΔ"           -0.07
    "่านมา"         -0.07
    "атов"         -0.06
    " nose"        -0.06
    (blank)        -0.06
    " sắp"         -0.06
    " сделать"     -0.06
    "าคา"          -0.06
    "ű"            -0.06
    Positive Logits
    Token          Logit
    "(actor"       0.07
    "Ajax"         0.07
    "(ht"          0.07
    " crea"        0.06
    " TEMP"        0.06
    ";a"           0.06
    "_published"   0.06
    " embarrass"   0.06
    ".cent"        0.06
    " ISSUE"       0.06
    Activation Density: 0.008%

    No Known Activations