    Explanations

    phrases related to a contrast or contradiction

    phrases characterized by evaluative or opinionated expressions

    Negative Logits
    "luaj"             -0.70
    "ukong"            -0.62
    " newsp"           -0.60
    "dit"              -0.60
    "Laughs"           -0.59
    "ologically"       -0.58
    "bies"             -0.58
    " welf"            -0.56
    "iyah"             -0.55
    "Introduced"       -0.55

    Positive Logits
    " nonetheless"      1.29
    " nevertheless"     1.22
    " hardly"           1.04
    " still"            1.03
    " unlikely"         1.02
    " undeniable"       1.00
    " undeniably"       0.99
    " doubtful"         0.97
    " unclear"          0.97
    " certainly"        0.94

    Activation Density: 0.172%

    No Known Activations