INDEX
    Explanations

    names of well-known individuals, particularly celebrities and public figures

    Negative Logits
    Els         -0.77
     Cth        -0.67
    orage       -0.66
    oard        -0.64
     Agric      -0.62
    ework       -0.61
    ĪĴ          -0.60
    req         -0.60
     Sek        -0.60
     Christie   -0.59
    Positive Logits
     famously   1.07
     attends    1.02
     joked      0.96
     himself    0.92
     Himself    0.92
     Jr         0.91
     greets     0.90
     tweeted    0.88
     testified  0.84
     aka        0.81
    Act Density 0.236%

    No Known Activations