INDEX
    Explanations

    phrases indicating negation or refusal

    negation statements, particularly phrases that include "won't."

    Negative Logits
    Token              Logit
    "soType"           -0.66
    " Floating"        -0.63
    "itiz"             -0.60
    " linkage"         -0.60
    "Kings"            -0.58
    " Forums"          -0.58
    " Pipeline"        -0.57
    "Hardware"         -0.57
    " illustration"    -0.56
    "edIn"             -0.56

    Positive Logits
    Token              Logit
    " necessarily"     0.94
    "ardless"          0.85
    "itles"            0.81
    "apest"            0.80
    "ember"            0.79
    "rees"             0.78
    "payers"           0.76
    "urtle"            0.76
    "ournament"        0.75
    "angular"          0.75
    Activation Density: 0.035%

    No Known Activations