    Explanations

    mentions of support for a specific organization or cause

    the presence of the word "that" in various contexts

    Negative Logits
    Token           Logit
    " Doctrine"     -0.67
    " Corps"        -0.62
    " Ruk"          -0.62
    " Hallow"       -0.61
    " Sad"          -0.61
    "NES"           -0.60
    " Anyway"       -0.59
    "raz"           -0.59
    " Planning"     -0.59
    "riot"          -0.58
    Positive Logits
    Token           Logit
    " arose"        1.01
    " preceded"     0.93
    " comprise"     0.91
    " occur"        0.89
    " accumulate"   0.89
    " weren"        0.88
    " arise"        0.88
    " resulted"     0.88
    " circulate"    0.86
    " compose"      0.86
    Activation Density: 0.191%

    No Known Activations