INDEX
    Explanations
    Negative Logits
    " for"    1.23
    " as"     1.23
    "ere"     1.21
    "for"     1.20
    "re"      1.19
    " and"    1.12
    "it"      1.10
    "ed"      1.09
    "m"       1.08
    "inin"    1.02
    Positive Logits
    "та"      1.85
    "на"      1.79
    "ري"      1.42
    "س"       1.26
    "א"       1.23
    " في"     1.21
              1.21
    "ні"      1.20
    "наў"     1.16
    "كَ"      1.12
    Act Density 0.000%

    No Known Activations