INDEX
    Explanations

    mentions of specific individuals and their actions or situations

    Negative Logits
    selves          -0.93
     unison         -0.86
     selves         -0.84
    results         -0.69
    Recommend       -0.67
     asses          -0.66
    OTAL            -0.66
     collective     -0.66
     Consumers      -0.65
    mination        -0.65
    Positive Logits
     himself        1.66
     Himself        1.19
     assassinated   1.06
     his            1.02
     herself        0.97
     famously       0.94
     personally     0.94
    enegger         0.89
     resigned       0.88
     persona        0.88
    Activation Density: 8.478%

    No Known Activations