INDEX
    Explanations

    words related to people or persons

    references to individuals or groups of people

    Negative Logits
    Token         Logit
    "shire"       -0.68
    "picture"     -0.67
    "enegger"     -0.66
    "DEV"         -0.64
    " Pulse"      -0.63
    "enance"      -0.62
    " diplom"     -0.62
    " Beir"       -0.61
    " EDITION"    -0.61
    "OME"         -0.59

    Positive Logits
    Token         Logit
    "gging"       1.16
    "eking"       1.15
    "ggy"         1.14
    "eps"         1.13
    "formance"    1.07
    "gged"        1.05
    "eping"       1.04
    "asant"       1.03
    "ptic"        1.02
    "pperc"       1.01

    Activation Density: 0.015%

    No Known Activations