    Negative Logits
        Token           Value
        "of"            1.29
        "б"             1.24
        " on"           1.09
        "ك"             1.09
        "t"             1.06
        "ק"             1.02
        "ج"             1.00
        "ac"            0.95
        (not shown)     0.91
        "з"             0.91
    Positive Logits
        Token           Value
        "I"             1.20
        "Sierra"        1.02
        " Sierra"       0.89
        "O"             0.86
        (not shown)     0.85
        (not shown)     0.83
        "F"             0.82
        (not shown)     0.82
        "E"             0.82
        "K"             0.80
    Activation Density: 0.001%

    No Known Activations