INDEX
    Explanations

    phrases related to expressing opinions or giving warnings

    statements and warnings regarding social and political issues

    Negative Logits
    ecast         -0.89
    ocument       -0.82
    ixture        -0.73
    tyard         -0.71
    ixt           -0.68
    adena         -0.67
    efficients    -0.64
    ocaust        -0.64
    andem         -0.64
    etrical       -0.62
    Positive Logits
     accordingly   0.96
     afterward     0.71
     encour        0.70
     sarcast       0.70
     stressing     0.69
     furthermore   0.68
     vowed         0.67
     nonetheless   0.66
     blaming       0.66
     rhet          0.64
    Act Density 0.318%

    No Known Activations