INDEX
    Explanations

    phrases related to consequences or impacts

    Negative Logits
    ahime        -0.75
    eg           -0.73
    ortment      -0.71
    rongh        -0.69
    ometimes     -0.67
    hap          -0.61
    itton        -0.60
    eele         -0.59
    initely      -0.58
    eworks       -0.57
    Positive Logits
     whatsoever  2.20
     nor         1.48
     anymore     1.20
     except      1.07
    nor          1.04
     slightest   0.96
    soever       0.91
     anywhere    0.88
     anybody     0.86
     anything    0.85
    Activation Density 1.642%

    No Known Activations