INDEX
    Explanations

    sentences containing contrasting statements or ideas

    the word "but" to indicate contrast or exception

    Negative Logits
    "roy"     -0.85
    "oys"     -0.75
    "oun"     -0.71
    "ump"     -0.69
    "itto"    -0.69
    "osite"   -0.68
    "uddin"   -0.68
    "ands"    -0.66
    "oop"     -0.65
    "asar"    -0.64
    Positive Logits
    "tons"              1.05
    " alas"             1.04
    " nevertheless"     1.03
    " nonetheless"      0.99
    " fortunately"      0.91
    " luckily"          0.90
    " beware"           0.83
    " insofar"          0.82
    " hey"              0.80
    " unfortunately"    0.76
    Activation Density: 0.177%

    No Known Activations