INDEX
    Explanations
    Negative Logits
        Token            Logit
        " densely"       -0.09
        "sta"            -0.07
        "gen"            -0.07
        (not shown)      -0.07
        " respective"    -0.07
        "Gen"            -0.07
        " disclaim"      -0.07
        "ره"             -0.07
        "eri"            -0.07
        " flood"         -0.07
    Positive Logits
        Token            Logit
        " klientów"      0.08
        (not shown)      0.08
        (not shown)      0.07
        " dotycz"        0.07
        " partijen"      0.07
        " devotees"      0.07
        "—is"            0.07
        " клиента"       0.07
        " menuju"        0.07
        " Gibt"          0.07
    Activation Density: 0.005%

    No Known Activations