    Negative Logits
    "ار"      1.02
    "<0x80>"  0.94
    "il"      0.89
    "ol"      0.89
    "ar"      0.87
    "۔"       0.87
    "ні"      0.86
    "۰"       0.81
    "iin"     0.79
    "ară"     0.78
    Positive Logits
    "ED"      1.12
    "-"       1.02
    "ă"       0.93
    "ER"      0.83
    " ב"      0.82
    " to"     0.82
    " rates"  0.78
    "ك"       0.78
    "ES"      0.77
    (whitespace-only token)  0.77
    Activation Density: 0.001%

    No Known Activations