    Explanations

    explicit disclaimers in a text

    assertions of opinion or truthfulness related to content

    Negative Logits
    Token            Logit
    "rants"          -0.70
    "colm"           -0.70
    "vati"           -0.70
    "eness"          -0.67
    "Roberts"        -0.66
    "worthiness"     -0.66
    "luaj"           -0.65
    "LOS"            -0.65
    "racuse"         -0.63
    "hung"           -0.63
    Positive Logits
    Token            Logit
    " approximate"   1.07
    " unofficial"    1.03
    " NOT"           1.01
    " purely"        0.99
    " subjective"    0.98
    " tentative"     0.97
    " fictitious"    0.95
    " provisional"   0.94
    " strictly"      0.91
    " preliminary"   0.91
    Activation Density: 0.191%

    No Known Activations