INDEX
    Explanations
    Negative Logits
    و  1.03
    ات  0.99
    lari  0.95
    ین  0.91
    را  0.91
    0.90
    ્સ  0.90
    0.86
    ной  0.85
    vät  0.84
    Positive Logits
    ,  0.77
    A  0.73
     e  0.71
     A  0.71
    I  0.69
     R  0.67
     I  0.66
    AN  0.66
    }{  0.64
    IN  0.62
    Act Density 1.230%

    No Known Activations