INDEX
    Explanations

    names and surnames, particularly those with specific endings

    Negative Logits
    Token          Logit
    "amer"         -0.17
    "leton"        -0.17
    "ermann"       -0.17
    "ichi"         -0.16
    "emer"         -0.15
    ".sul"         -0.15
    "ibal"         -0.15
    " Mutation"    -0.14
    "alu"          -0.14
    "ponce"        -0.14
    Positive Logits
    Token          Logit
    "rosso"        0.15
    " hale"        0.14
    "VID"          0.14
    ".simps"       0.14
    "rub"          0.14
    " Tut"         0.14
    " bud"         0.13
    " Aub"         0.13
    " sơ"          0.13
    "++."          0.13
    Act Density 0.004%

    No Known Activations