    Negative Logits
    терес          -0.07
    oplay          -0.06
    (not shown)    -0.06
    (not shown)    -0.06
    plemented      -0.06
    (not shown)    -0.06
     repetitions   -0.06
     Valerie       -0.06
     поля          -0.06
     خداوند        -0.06
    Positive Logits
     dla           0.07
     quam          0.07
     gotten        0.06
    (bind          0.06
    'LBL           0.06
    Washington     0.06
    Þ              0.06
    celain         0.06
     ciz           0.06
     enact         0.06
    Activation Density: 0.000%

    No Known Activations