    Explanations

    phrases related to negative consequences or outcomes

    phrases indicating outcomes or consequences

    Negative Logits
    Token          Logit
    hid            -0.67
     doubtless     -0.60
     Doodle        -0.58
     THREE         -0.58
     sandwic       -0.58
     assorted      -0.57
    Origin         -0.56
    pes            -0.56
     aliases       -0.56
    bryce          -0.55
    Positive Logits
    Token          Logit
     anymore        1.21
     meaningful     1.12
     satisfactory   1.04
     sufficient     1.04
    acea            1.02
     lasting        1.00
     adequate       0.98
     anything       0.95
     substantive    0.93
     any            0.92
    Activation Density: 0.672%

    No Known Activations