    Negative Logits
    "рукт"           -0.06
    "inea"           -0.06
    " ministries"    -0.06
    "icing"          -0.06
    "dda"            -0.06
    "-sur"           -0.06
    "regexp"         -0.06
    "lier"           -0.06
    "-flag"          -0.06
    "'u"             -0.06
    Positive Logits
    (whitespace)     0.07
    " chvíli"        0.07
    (blank)          0.07
    "(layers"        0.07
    " ÜNİVERSİTESİ"  0.06
    "ITED"           0.06
    " viv"           0.06
    " cleanliness"   0.06
    "هم"             0.06
    " stopwatch"     0.06
    Activation Density: 0.000%

    No Known Activations