INDEX
    Explanations
    Negative Logits
    " cùng"          -0.06
    " commem"        -0.06
    " pady"          -0.06
    "girls"          -0.06
    " City"          -0.06
    " governor"      -0.06
                     -0.05
    " Validator"     -0.05
    "ım"             -0.05
    " ابت"           -0.05
    Positive Logits
    " ');"           0.08
    " yerel"         0.07
    ".depart"        0.07
    ".StartsWith"    0.07
    ".Node"          0.07
    " }}"            0.07
    "errer"          0.07
    "وار"            0.07
    "(DIR"           0.06
    "(a"             0.06
    Activation Density: 0.000%

    No Known Activations