INDEX
    Explanations

    phrases prompting or advising against a certain action

    imperatives and negative commands or suggestions

    Negative Logits
    "ELD"            -0.76
    "Redd"           -0.70
    "liner"          -0.68
    "dimension"      -0.63
    "prof"           -0.63
    "ħĭ"             -0.62
    "established"    -0.61
    "upon"           -0.61
    "milo"           -0.61
    "ilage"          -0.61
    Positive Logits
    " underestimate"    0.89
    " hesitate"         0.86
    "theless"           0.86
    "ndum"              0.82
    "heny"              0.78
    " kidding"          0.76
    "ardless"           0.74
    " worry"            0.74
    "omsday"            0.74
    "ations"            0.72
    Activation Density 0.057%

    No Known Activations