INDEX
    Explanations

    abstract methods and classes

    Negative Logits
    Token             Logit
    ل                 0.82
    ك                 0.79
    л                 0.75
    ки                0.63
    он                0.61
    [token missing]   0.61
    ли                0.60
    कर                0.59
    ार्क               0.59
    كر                0.59
    Positive Logits
    Token   Logit
    t       1.10
    y       1.01
    o       1.00
    ing     0.91
    al      0.84
    e       0.84
    a       0.80
    i       0.80
    !")     0.77
    r       0.72
    Activation Density: 0.001%

    No Known Activations