    Explanations

    names of a specific person, possibly from news articles

    Negative Logits
    "*/("            -0.82
    "cision"         -0.77
    " derogatory"    -0.66
    "achers"         -0.64
    "urers"          -0.63
    "things"         -0.63
    "ocratic"        -0.63
    "ework"          -0.62
    "cipline"        -0.62
    "chnology"       -0.62
    Positive Logits
    "aii"            0.86
    " Bei"           0.76
    " Sue"           0.75
    " Karen"         0.75
    " Allen"         0.74
    "Anne"           0.74
    " Silk"          0.73
    " Larson"        0.73
    " Bang"          0.70
    " Dunham"        0.70
    Activation Density: 0.020%

    No Known Activations