    Explanations

    "a" or "an" followed by a word

    Negative Logits
    Token          Logit
    "<bos>"        -1.95
    "."            -1.82
    " to"          -1.76
    "rijving"      -1.70
    " anderem"     -1.63
    "undidad"      -1.63
    " and"         -1.56
    ",\,\"         -1.51
    "𝘴"            -1.50
    " dahin"       -1.47

    Positive Logits
    Token          Logit
    "AS"           1.73
    " emballage"   1.66
    "込む"         1.54
    "ifiers"       1.52
    "alians"       1.51
                   1.49
    "ナイキ"       1.48
    " icona"       1.48
                   1.45
    " -"           1.43

    Activation Density: 0.341%

    No Known Activations