INDEX
    Explanations

    names of public figures or individuals

    proper nouns, specifically names of people and places

    Negative Logits
    "nesday"       -0.81
    "cule"         -0.78
    "ationally"    -0.78
    "ATIONAL"      -0.77
    " Pats"        -0.75
    "onial"        -0.74
    " GI"          -0.69
    "oso"          -0.68
    "atile"        -0.67
    "alde"         -0.66
    Positive Logits
    " Kejriwal"     1.05
    "jriwal"        0.94
    "Keefe"         0.76
    " Clarkson"     0.73
    "eters"         0.73
    "ly"            0.71
    "agher"         0.69
    "lers"          0.69
    "loo"           0.68
    "zeb"           0.67
    Activation Density: 0.009%

    No Known Activations