INDEX
    Explanations

    b, br, bx followed by specific characters

    Negative Logits
    Token           Value
    "ل"             0.75
    "R"             0.71
    "ש"             0.68
    "Y"             0.66
    "The"           0.64
    "л"             0.64
    "H"             0.63
    "G"             0.63
    "P"             0.63
    "ל"             0.62
    Positive Logits
    Token           Value
    "kort"          0.56
    " is"           0.54
    "ция"           0.53
    "ి"             0.52
                    0.51
    "коли"          0.50
                    0.50
    "зки"           0.50
    " Organizing"   0.50
    " जिसके"        0.49
    Activation Density: 0.482%

    No Known Activations