    Explanations

    phrases related to contrasts or alternatives

    punctuation, specifically commas

    Negative Logits
    Token               Logit
    " Slate"            -0.64
    " Talks"            -0.61
    "ÅĤ"                -0.60
    " Theft"            -0.59
    " Coverage"         -0.58
    "olves"             -0.57
    " CLR"              -0.55
    "gow"               -0.54
    " Others"           -0.54
    " Documentation"    -0.54

    Positive Logits
    Token               Logit
    " alas"             0.86
    " somew"            0.85
    " respectively"     0.80
    " albeit"           0.79
    " depending"        0.78
    " um"               0.75
    " uh"               0.74
    " unsurprisingly"   0.74
    "女"                0.72
    "according"         0.71

    Activation Density: 0.235%

    No Known Activations