INDEX
    Explanations

    negations or contradictions in sentences

    negations or phrases that express what should not be done or considered

    Negative Logits
    tein          -0.78
    velt          -0.70
    WER           -0.67
    LIN           -0.67
    hower         -0.66
    ulty          -0.65
    LG            -0.65
    rift          -0.64
    weekly        -0.64
    LU            -0.62
    Positive Logits
     necessarily  1.36
    icably        1.14
    epad          1.10
    icable        1.00
    etheless      0.99
     bothering    0.91
     adequately   0.86
     bothered     0.85
    eworthy       0.84
    ifies         0.82
    Activation Density: 0.048%

    No Known Activations