    Explanations

    phrases indicating a contrast or alternative

    expressions that contrast or clarify ideas, often using the word "rather."

    Negative Logits
    uay             -0.91
    amba            -0.84
    adium           -0.80
     Yard           -0.75
    ocaust          -0.74
    aido            -0.73
    iens            -0.71
    arent           -0.71
    ilty            -0.69
    rival           -0.67
    Positive Logits
     than            0.77
     interestingly   0.67
     informative     0.66
     differentiate   0.66
    FTWARE           0.65
     amusing         0.64
     distinguish     0.63
     conservatism    0.62
     akin            0.61
     tame            0.60
    Activation Density: 0.014%

    No Known Activations