    NEGATIVE LOGITS
    Token                 Value
    "l"                   1.00
    "س"                   1.00
    "barn"                0.97
    "eper"                0.93
    "lig"                 0.89
    "ltr"                 0.88
    (token not shown)     0.88
    "ا"                   0.87
    "align"               0.84
    "сти"                 0.84
    POSITIVE LOGITS
    Token                 Value
    "quoise"              1.13
    (token not shown)     1.07
    "ोहित"                1.01
    " вій"                0.99
    "ফান"                 0.96
    " disadvant"          0.94
    "ಶಾ"                  0.93
    " advant"             0.92
    "selValue"            0.91
    " लेसन"               0.91
    Activation Density: 0.107%

    No Known Activations