INDEX
    Explanations

    used after agreement or confirmation

    Negative Logits
    [token not shown]  2.02
    ب  1.30
    ס  1.23
    ia  1.14
    ك  1.14
    ם  1.14
    ך  1.12
    ,  1.11
    a  1.05
     that  1.05
    Positive Logits
    I  1.32
    for  1.31
    ind  1.25
    ib  1.06
    ot  1.05
    W  1.03
    Τ  1.03
    they  1.02
    F  1.02
    X  1.02
    Act Density 0.000%

    No Known Activations