INDEX
    Explanations

    Names, locations, foreign words

    Negative Logits
    " JE"      -0.09
               -0.08
    "-Se"      -0.08
    "me"       -0.08
    "ge"       -0.08
    " Ade"     -0.07
    " ende"    -0.07
    " Arte"    -0.07
    " Fro"     -0.07
    "ie"       -0.07
    Positive Logits
    "an"       0.13
    "AN"       0.12
    "man"      0.12
    "a"        0.12
    "A"        0.12
    "MAN"      0.10
    ">An"      0.10
    "ian"      0.10
    "al"       0.10
    " Ivan"    0.09
    Activation Density: 0.334%

    No Known Activations