INDEX
    Explanations

    instances where the text contrasts a situation with potential negative implications against something else or raises concerns

    phrases indicating a contrast or comparison

    Negative Logits
    "si"          -0.82
    "RIP"         -0.79
    "atos"        -0.76
    "bowl"        -0.76
    "itect"       -0.74
    "pecially"    -0.72
    "PI"          -0.70
    "utm"         -0.68
    "()."         -0.68
    "Its"         -0.68
    Positive Logits
    " nonetheless"     1.04
    "etheless"         0.98
    " undeniable"      0.80
    " nevertheless"    0.76
    " deeper"          0.75
    "chers"            0.74
    " curiously"       0.74
    " undeniably"      0.72
    " challeng"        0.72
    " broader"         0.71
    Act Density 0.592%

    No Known Activations