INDEX
    Explanations

    phrases emphasizing a strong opinion or negation

    emphatic negations and strong disclaimers

    Negative Logits
    Token       Logit
    "lahoma"    -0.73
    "apters"    -0.67
    "rity"      -0.67
    "urer"      -0.65
    "orio"      -0.65
    "iary"      -0.65
    "liest"     -0.65
    "auga"      -0.65
    "urers"     -0.65
    "ariat"     -0.65
    Positive Logits
    Token       Logit
    "LY"        1.57
    "ALLY"      1.45
    "THING"     1.45
    "ELY"       1.42
    "ONE"       1.41
    "HO"        1.37
    "OSE"       1.37
    "LESS"      1.36
    "NESS"      1.36
    " THERE"    1.36
    Act Density 0.158%

    No Known Activations