    Explanations

    phrases that indicate contrast or contradiction in arguments

    Negative Logits
    Token (quoted to show leading spaces)    Logit
    " zwar"                                  -0.98
    " sice"                                  -0.91
    "basicConfig"                            -0.81
    " Mentre"                                -0.76
    "tvguidetime"                            -0.73
    " although"                              -0.72
    "while"                                  -0.72
    "alors"                                  -0.72
    " vaikka"                                -0.70
    "although"                               -0.69

    Positive Logits
    Token (quoted to show leading spaces)    Logit
    " nonetheless"                           1.24
    " nevertheless"                          1.19
    " dennoch"                               1.06
    "それでも"                               0.98
    " there"                                 0.83
    " Nonetheless"                           0.82
    " still"                                 0.82
    "Nonetheless"                            0.80
    " ultimately"                            0.79
    " trotzdem"                              0.79

    Activation Density: 0.198%

    No Known Activations