INDEX
    Explanations

    themes related to systemic inequality and social power dynamics

    Negative Logits
    " attacking"    -0.13
    "reau"          -0.13
    " Duty"         -0.13
    "íļĮ"           -0.13
    " deficient"    -0.13
    "rarian"        -0.13
    "ndl"           -0.13
    "INCT"          -0.13
    "ourn"          -0.12
    "pai"           -0.12
    Positive Logits
    " rule"           0.43
    " control"        0.42
    " exercise"       0.39
    " controls"       0.39
    "control"         0.34
    " Controls"       0.34
    " RULE"           0.33
    " controlling"    0.33
    "controls"        0.33
    "rule"            0.32
    Activation Density: 0.371%

    No Known Activations