INDEX
    Explanations

    phrases related to comparison or contrast

    instances of the word "and" or similar conjunctions in lists

    Negative Logits
    Token          Logit
    "cies"         -0.68
    "itionally"    -0.68
    "itions"       -0.67
    "itives"       -0.66
    "orer"         -0.65
    "que"          -0.65
    "hens"         -0.64
    "coat"         -0.63
    "enta"         -0.62
    "eed"          -0.62
    Positive Logits
    Token             Logit
    "rongh"           0.65
    " somew"          0.62
    " recomm"         0.59
    " diminishing"    0.59
    " Aph"            0.59
    " Ares"           0.58
    "uana"            0.58
    " disqual"        0.58
    " grav"           0.57
    "accompanied"     0.57
    Activation Density: 0.206%

    No Known Activations