INDEX
    Explanations

    negations or expressions of doubt and uncertainty in the text

    Negative Logits
    Token       Logit
    "uto"       -0.16
    "åĨµ"       -0.15
    "iges"      -0.15
    "åħ¸"       -0.15
    "_tt"       -0.15
    "_DL"       -0.15
    "XT"        -0.14
    "inet"      -0.14
    "ãĤ«ãĥĨ"    -0.14
    "ran"       -0.14
    Positive Logits
    Token            Logit
    " much"          0.17
    " minib"         0.17
    " originally"    0.16
    " anymore"       0.15
    "much"           0.15
    " anything"      0.15
    " initially"     0.14
    " exact"         0.14
    " anywhere"      0.14
    " Much"          0.14
    Activation Density: 0.166%

    No Known Activations