    Explanations

    names of famous personalities

    mentions of specific individuals' names

    Negative Logits
    Token       Logit
    "mine"      -0.97
    "aic"       -0.93
    "osition"   -0.88
    "icular"    -0.85
    "lisher"    -0.85
    "joined"    -0.84
    "ri"        -0.84
    "rity"      -0.84
    "minist"    -0.83
    "iated"     -0.82
    Positive Logits
    Token        Logit
    " Wallace"   0.90
    " Hayes"     0.77
    " Ellis"     0.76
    " Stevens"   0.75
    " Strait"    0.72
    " Williams"  0.69
    " Hole"      0.69
    " Owens"     0.66
    " Waters"    0.64
    " Tyson"     0.64
    Activation Density: 0.109%

    No Known Activations