INDEX
    Explanations

    names of places or names of people

    Negative Logits
    Token        Logit
    <bos>        -1.30
     그것        -0.79
     나는        -0.73
     for         -0.71
     just        -0.71
     in          -0.70
     자신의      -0.69
     at          -0.69
     책          -0.68
     within      -0.68
    Positive Logits
    Token        Logit
     embra       1.81
     effe        1.80
     dispen      1.80
     alkoh       1.76
     abnorm      1.72
     pessi       1.71
     bett        1.69
     kram        1.69
     simplif     1.68
     wien        1.68
    Activation Density: 0.297%

    No Known Activations