    Explanations

    names of people

    proper nouns, particularly names of people and family connections

    Negative Logits
    Token           Logit
    "tracking"      -0.84
    "CONCLUS"       -0.81
    " scanners"     -0.72
    "pmwiki"        -0.69
    "Platform"      -0.66
    " incent"       -0.65
    " LEVEL"        -0.65
    "weapon"        -0.64
    " subreddit"    -0.62
    " dystop"       -0.60

    Positive Logits
    Token           Logit
    "mie"           1.06
    " Jr"           1.00
    " Sr"           0.98
    " Rodham"       0.95
    "ilde"          0.91
    "nie"           0.90
    "hyde"          0.90
    " Doe"          0.89
    "lynn"          0.89
    "abeth"         0.88

    Act Density 0.170%

    No Known Activations