INDEX
    Explanations

    conjunctions followed by contrasting statements

    Negative Logits
    irms          -0.69
    dayName       -0.66
    icated        -0.65
    ":["          -0.64
    irm           -0.64
    rouse         -0.63
    ses           -0.63
    berries       -0.62
    etermined     -0.62
    Requires      -0.62
    Positive Logits
     alas         0.98
    withstanding  0.92
    owsky         0.90
    tons          0.87
    romeda        0.85
     wait         0.84
     hey          0.79
     beware       0.78
     suppose      0.77
     yeah         0.77
    Activation Density: 0.204%

    No Known Activations