INDEX
    Explanations

    Discussions of socio-political and cultural criticism, particularly concerning privilege and systemic issues

    Negative Logits
    Token          Logit
    "ior"          -0.17
    "oot"          -0.15
    "ンス"         -0.15
    ".jetbrains"   -0.15
    "est"          -0.14
    " WA"          -0.14
    "emm"          -0.14
    "WA"           -0.13
    " Fol"         -0.13
    " cho"         -0.13
    Positive Logits
    Token          Logit
    "xCB"          0.17
    "OMIC"         0.16
    "ymes"         0.15
    " Hüs"         0.15
    "defer"        0.15
    "deniz"        0.15
    "оды"          0.15
    "ddy"          0.15
    "elijk"        0.14
    "//{{"         0.14
    Activation Density: 0.006%

    No Known Activations