INDEX
    Explanations

    punctuation and conjunctions

    Negative Logits
    Token       Value
    "and"       0.42
    "an"        0.41
    "t"         0.39
    "u"         0.39
    (blank)     0.36
    "ar"        0.35
    "al"        0.33
    "er"        0.32
    "it"        0.32
    "-"         0.32
    Positive Logits
    Token           Value
    " étaient"      0.32
    "íamos"         0.31
    " impuestos"    0.30
    " ataques"      0.30
    " ešte"         0.30
    " ؟"            0.29
    " kucing"       0.29
    " erano"        0.29
    " jakiś"        0.29
    " idk"          0.29
    Activation Density: 0.000%

    No Known Activations