    NEGATIVE LOGITS
    Token   Value
    M       1.48
    T       1.20
    S       1.13
    K       1.13
    W       1.08
    V       1.05
    O       1.04
    OCH     1.03
    F       0.97
    H       0.96
    POSITIVE LOGITS
    Token             Value
    на                1.35
    ת                 1.26
    ية                1.23
    [token missing]   1.20
    ان                1.10
    ัล                1.09
    ਾਂ                1.09
    ى                 1.09
    ىر                1.06
    ala               1.05
    Activation Density: 0.027%

    No Known Activations