INDEX
    Explanations

    mentions of power dynamics and political conflicts

    Negative Logits
    Token         Logit
    "aroo"        -0.16
    "emplates"    -0.15
    "̣"           -0.15
    "cctor"       -0.15
    " Arbeit"     -0.15
    "inspace"     -0.14
    "ndern"       -0.14
    "dete"        -0.14
    "amework"     -0.14
    "EMPLARY"     -0.14
    Positive Logits
    Token           Logit
    " power"        0.23
    " political"    0.22
    " purge"        0.20
    " trait"        0.20
    " Political"    0.19
    " politically"  0.18
    " powerful"     0.18
    "loy"           0.17
    "faction"       0.17
    " palace"       0.17
    Activation Density: 0.094%

    No Known Activations