INDEX
    Explanations

    phrases expressing contrasting or opposing ideas

    conjunctions, particularly the word "but" in various contexts

    Negative Logits
    agra            -0.70
    tnc             -0.67
    itto            -0.65
    entry           -0.60
    ober            -0.59
    oin             -0.59
    built           -0.59
    ription         -0.59
    rongh           -0.59
    edu             -0.59
    Positive Logits
    tons            0.99
     alas           0.94
    chery           0.89
     nevertheless   0.88
     nonetheless    0.83
     fortunately    0.77
     luckily        0.77
    chers           0.74
     hey            0.73
     unfortunately  0.73
    Act Density: 0.148%

    No Known Activations