INDEX
    Explanations

    proper nouns related to significant individuals or entities

    Negative Logits
    elle     -0.21
    eh       -0.21
    ess      -0.21
    ex       -0.20
    essa     -0.19
    ine      -0.18
    alls     -0.17
    esser    -0.17
    ev       -0.17
    els      -0.17
    Positive Logits
    boro      0.19
    orraine   0.19
    abeled    0.19
    alu       0.18
    isle      0.17
    toi       0.17
    uster     0.17
    homme     0.17
    íky       0.17
    ighth     0.17
    Act Density 0.067%

    No Known Activations