INDEX
    Explanations

    references to popular reactions or opinions that indicate success or failure

    contrasting statements using "but" to introduce unexpected or opposing ideas.

    Negative Logits
    " zwar"        -0.79
    " sice"        -0.71
    " dessutom"    -0.69
    "更是"         -0.66
    " nejen"       -0.63
    " although"    -0.63
    "雖然"         -0.62
    " zudem"       -0.61
    " even"        -0.60
    " außerdem"    -0.59
    Positive Logits
    " nonetheless"     1.83
    " nevertheless"    1.78
    "Nonetheless"      1.27
    "Nevertheless"     1.25
    " Nonetheless"     1.25
    " Nevertheless"    1.20
    " dennoch"         1.13
    " trotzdem"        1.12
    "それでも"         1.09
    " ändå"            1.09
    Act Density 0.585%

    No Known Activations