INDEX
    Explanations

    themes related to power dynamics and accountability

    Negative Logits
     now
    -0.18
     currently
    -0.17
    缮åīį
    -0.15
     presently
    -0.15
     Currently
    -0.15
    něl
    -0.15
     Cres
    -0.14
     tonight
    -0.14
     ahora
    -0.14
    ingly
    -0.14
    Positive Logits
     oneself
    0.18
     usually
    0.17
     sometimes
    0.15
    ometimes
    0.15
    rts
    0.14
    usually
    0.14
    adero
    0.14
    وغ
    0.14
    avaş
    0.14
     gens
    0.14
    Activation Density: 0.796%

    No Known Activations