INDEX
    Explanations

    foreign names and words

    Negative Logits
    an        1.78
    er        1.76
    u         1.65
     for      1.63
     by       1.38
    a         1.34
    as        1.33
    i         1.32
    ed        1.28
    ing       1.27
    Positive Logits
    с         1.05
    ні        0.84
    ならない  0.73
    нын       0.72
    一種      0.71
    )         0.70
    はなく    0.70
              0.69
    ın        0.68
     giusto   0.66
    Act Density 0.095%

    No Known Activations