INDEX
    Explanations

    phrases that indicate actions taken by populations or groups related to expressing dissatisfaction or taking control

    Negative Logits
    ^(@)        -0.73
    ſelves      -0.72
     Reſ        -0.65
     $_"        -0.64
    IBLIO       -0.64
     itſelf     -0.64
     leſs       -0.62
     Majefty    -0.60
    ſelf        -0.60
    ENEFITS     -0.60
    Positive Logits
     taking     0.99
    Taking      0.98
     Taking     0.95
     taken      0.95
     TAKEN      0.86
     take       0.84
     Take       0.84
    taken       0.82
     takes      0.81
    taking      0.80
    Activation Density: 0.224%

    No Known Activations