INDEX
    Explanations

    sentences related to political and social issues

    Negative Logits
    hement      -0.70
    regon       -0.64
    avery       -0.64
    taboola     -0.63
    utherland   -0.62
    mbuds       -0.61
    xes         -0.61
    ipped       -0.61
    ttes        -0.60
    wright      -0.60
    Positive Logits
     happening          0.85
    piring              0.73
    natureconservancy   0.67
     transpired         0.66
     difference         0.64
     happen             0.64
     motivating         0.62
     bothering          0.62
     grunt              0.62
     dstg               0.61
    Activation Density: 0.171%

    No Known Activations