    Explanations

    recognizing query beginnings

    Negative Logits
    " on"     1.51
    " is"     1.26
    " has"    1.24
    "a"       1.23
    "ل"       1.20
    " at"     1.06
    "at"      1.05
    "ik"      0.94
    "々の"    0.94
    "o"       0.92
    Positive Logits
    " anglais"    1.04
    "."           0.95
    "ORE"         0.94
    "ט"           0.93
    (missing)     0.91
    (missing)     0.88
    "ской"        0.82
    (missing)     0.80
    (missing)     0.80
    " ຫຼື"         0.79
    Activation Density: 0.950%

    No Known Activations