INDEX
    Explanations

    endings and conjunctions

    Negative Logits
    ot      0.41
    '       0.40
    os      0.37
    s       0.35
            0.32
    es      0.32
    a       0.31
    دون     0.30
    l       0.30
    1       0.30
    Positive Logits
     or     0.51
     and    0.47
    and     0.43
     be     0.43
     και    0.41
     এবং    0.40
     σε     0.39
     in     0.38
     বা     0.37
     و      0.35
    Act Density 0.129%

    No Known Activations