    Explanations

    themes related to control and power dynamics, particularly in social contexts

    Negative Logits
    ezi            -0.17
    rix            -0.16
    lege           -0.16
    ạch            -0.15
    letic          -0.14
    iasi           -0.14
     McGregor      -0.14
    ÑĩеÑģкое       -0.14
     haze          -0.14
    uw             -0.14
    Positive Logits
    ñana           0.16
    ód             0.16
     /*#__         0.15
    quets          0.15
    å®Ī            0.15
    clud           0.14
    ictim          0.14
    OrCreate       0.13
     mot           0.13
     flo           0.13
    Act Density 0.065%

    No Known Activations