INDEX
    Explanations

    adverbs indicating likelihood or expectation

    phrases that assert expectations or recommendations

    Negative Logits
    Token       Logit
    "CI"        -0.67
    " Patty"    -0.62
    " Afgh"     -0.57
    " Rox"      -0.57
    " Cir"      -0.56
    " Mehran"   -0.56
    "yss"       -0.54
    " Ends"     -0.54
    " Fra"      -0.54
    "HER"       -0.54
    Positive Logits
    Token              Logit
    " ideally"         1.15
    "ered"             1.11
    " be"              1.09
    "ering"            1.04
    " suffice"         1.04
    " theoretically"   0.93
    "nt"               0.93
    " NEVER"           0.89
    " beware"          0.87
    " definitely"      0.86
    Activation Density: 0.068%

    No Known Activations