INDEX
    Explanations

    assertive statements or claims within a text

    phrases that indicate claims or statements about specific events or actions

    Negative Logits
    english    -0.84
    Laughs     -0.82
    byss       -0.81
    register   -0.79
    rex        -0.76
    vc         -0.76
    mmmm       -0.75
    utical     -0.74
    erenn      -0.73
    EStream    -0.73

    Positive Logits
     they       0.88
     ousted     0.85
     Saddam     0.83
     he         0.82
     hackers    0.82
     Barack     0.79
     millions   0.78
     President  0.77
     she        0.76
     Hillary    0.75
    Activation Density: 0.217%

    No Known Activations