INDEX
    Explanations

    strongly worded declarations of authorial opinions or arguments

    the word "that" indicating arguments or claims

    Negative Logits
    backer       -0.80
    stal         -0.70
    Guard        -0.67
    mouth        -0.65
    api          -0.63
    SEE          -0.60
    inar         -0.60
    Champ        -0.60
    atro         -0.60
    aq           -0.60

    Positive Logits
     there        0.75
     although     0.72
     justifies    0.69
     preserving   0.69
     abol         0.67
     someday      0.67
     we           0.65
     whoever      0.65
     prevailed    0.65
     "[           0.64

    Act Density 0.217%

    No Known Activations