INDEX
    Explanations

    proper names or nouns used to describe individuals

    proper nouns, specifically people's names

    Negative Logits
    Token           Logit
    "pron"          -0.67
    "uay"           -0.66
    "taboola"       -0.64
    "mble"          -0.64
    "uminati"       -0.60
    "berra"         -0.59
    "į"             -0.58
    " âĢº"          -0.57
    "abwe"          -0.57
    "ymm"           -0.57
    Positive Logits
    Token           Logit
    "'s"            0.70
    "kson"          0.65
    " herself"      0.65
    " wore"         0.62
    " swore"        0.60
    " saw"          0.59
    " Introduced"   0.58
    " himself"      0.57
    " realised"     0.57
    "owan"          0.57
    Activation Density: 0.249%

    No Known Activations