INDEX
    NEGATIVE LOGITS
    ע                  1.24
    е                  1.16
    ف                  1.16
    بي                 1.09
    (no token shown)   1.08
    ва                 1.06
    ב                  1.02
    го                 1.02
    IN                 0.99
    ре                 0.99
    POSITIVE LOGITS
     is                1.20
    in                 1.15
    ről                1.13
    }))                1.11
     come              1.06
     on                1.05
    kgs                1.04
    "                  1.04
     with              1.03
    r                  0.99
    Activation Density: 0.045%

    No Known Activations