INDEX
    Explanations

    assignments or comparisons with 1

    Negative Logits
    M           0.49
    Me          0.48
    G           0.47
    E           0.45
    K           0.44
    zenia       0.43
    Antib       0.42
    H           0.41
    L           0.41
    F           0.41

    Positive Logits
     basé       0.49
    ק           0.46
    arc         0.44
    _"+         0.44
     Egyptian   0.43
     segunda    0.43
    ard         0.43
    ου          0.43
     Second     0.43
    ंदा         0.43

    Activation Density: 0.033%

    No Known Activations