    Negative Logits
    Token          Logit
    " аласыз"      0.45
    " თქვენ"       0.42
    " calf"        0.42
    "hitva"        0.42
    "²',"          0.41
    " आईसीआर"      0.41
    " पुअनि"       0.40
    " lentils"     0.40
    " hereto"      0.40
    " lạc"         0.39

    Positive Logits
    Token          Logit
    " T"           1.48
    "T"            1.09
    " Т"           0.91
    (not shown)    0.77
    " الت"         0.77
    " Τ"           0.77
    " टी"          0.71
    "𝑇"            0.66
    " t"           0.64
    " R"           0.63
    Activation Density: 0.007%

    No Known Activations