INDEX
    Explanations

    "contemporary" + category

    NEGATIVE LOGITS
    ל  1.12
    ä  1.08
    א  1.06
    ول  1.05
    [token missing]  1.03
    จะ  0.91
    ע  0.88
     Riemannian  0.85
    [token missing]  0.85
    ри  0.84
    POSITIVE LOGITS
    t  2.03
    z  2.00
    k  1.97
    x  1.89
    y  1.86
    f  1.68
    s  1.64
    p  1.60
    w  1.58
    r  1.46
    Act Density 0.007%

    No Known Activations