INDEX
    Explanations

    phrases related to comparison or contrast

    references to the word "well."

    Negative Logits
    "anos"       -0.78
    "pid"        -0.65
    "anon"       -0.64
    "mare"       -0.62
    "ierce"      -0.62
    "anus"       -0.61
    "zan"        -0.61
    "absolute"   -0.61
    "mid"        -0.60
    "agara"      -0.60
    Positive Logits
    " evidenced"    0.61
    " optionally"   0.60
    "NESS"          0.60
    "onse"          0.59
    "umenthal"      0.58
    "FTWARE"        0.58
    " insofar"      0.57
    "Label"         0.57
    " vers"         0.57
    " possibly"     0.56
    Activation Density: 0.030%

    No Known Activations