    Explanations

    references to power dynamics and societal structures

    Negative Logits
    Token            Logit
    "httphttps"      -0.65
    "ftagPool"       -0.49
    "锈钢"            -0.46
    "TagMode"        -0.46
    " kaarangay"     -0.44
    "tonode"         -0.44
    "İstinadlar"     -0.42
    "رشف"            -0.40
    " שוליים"        -0.39
    " cours"         -0.39
    Positive Logits
    Token            Logit
    " also"          0.60
    " dessutom"      0.51
    " inoltre"       0.50
    "grunns"         0.48
    " også"          0.47
    " heller"        0.47
    " außerdem"      0.46
    "also"           0.46
    " також"         0.45
    " myös"          0.45
    Activation Density: 0.924%

    No Known Activations