INDEX
    Explanations

    instances of political commentary or critique

    Negative Logits
    341       -0.17
    chalk     -0.15
    ascade    -0.14
    erli      -0.14
    ials      -0.14
    cliffe    -0.14
    scape     -0.14
    eph       -0.14
    족        -0.13
    opc       -0.13
    Positive Logits
     cent       0.15
     Bak        0.15
     Cent       0.15
     symbolic   0.14
    ILT         0.14
     اتحاد      0.14
     reception  0.14
    rea         0.14
     Kore       0.13
    aram        0.13
    Act Density 0.167%

    No Known Activations