INDEX
    Explanations

    contrasting ideas or qualities in descriptions

    Negative Logits
    abor  -0.16
     addCriterion  -0.15
    ế  -0.15
    ined  -0.14
     leh  -0.14
    ura  -0.14
    andler  -0.14
    ghan  -0.14
    descending  -0.14
     sanity  -0.13
    Positive Logits
    åį»  0.17
    åį´  0.16
    cket  0.16
     nevertheless  0.16
    éľŀ  0.16
     enough  0.15
    lobs  0.15
    nier  0.15
     Opposition  0.14
    è¶³  0.14
    Activation Density: 0.181%

    No Known Activations