INDEX
    Explanations

    words related to surnames or last names

    words related to various forms of "le" or "re" morphology

    Negative Logits
    REDACTED        -0.75
     Tuc            -0.70
     Annotations    -0.70
     mastering      -0.69
     employing      -0.64
     Samar          -0.63
    !/              -0.63
     Maid           -0.62
     Dollars        -0.62
     Centauri       -0.61
    Positive Logits
    etooth          1.02
    eps             0.93
    chers           0.92
    angle           0.91
    pter            0.89
    uler            0.89
    ching           0.87
    pling           0.85
    ut              0.85
    angles          0.85
    Act Density 0.110%

    No Known Activations