    Negative Logits
    in              1.02
    ার              0.73
    inę             0.73
    er              0.68
    an              0.68
    inney           0.66
    that            0.64
    نر              0.64
    et              0.63
    یر              0.63
    Positive Logits
    ب               0.75
    ق               0.67
    이었            0.63
    W               0.63
    '(              0.60
     secretaries    0.60
                    0.59
    پ               0.57
     lenders        0.56
    ING             0.55
    Activation Density: 0.013%

    No Known Activations