INDEX
    Explanations
    Negative Logits
        Token        Value
        "an"         0.80
        (blank)      0.73
        "،"          0.68
        "ح"          0.67
        "ان"         0.66
        " on"        0.65
        "ش"          0.64
        (blank)      0.64
        "idegg"      0.63
        "اج"         0.62
    Positive Logits
        Token        Value
        "k"          0.88
        " ו"         0.73
        "cd"         0.66
        "ס"          0.64
        "December"   0.63
        "dokument"   0.63
        "с"          0.63
        "president"  0.62
        "c"          0.62
        "province"   0.61
    Act Density 0.045%

    No Known Activations