    Explanations

    negations and words that imply exclusion or limitation in statements

    Negative Logits
    oux               -0.16
    luv               -0.16
    599               -0.15
    urd               -0.15
    anel              -0.14
    ndon              -0.14
    imizer            -0.14
    ĺ认               -0.14
    constitutional    -0.14
    ÙĦÙĪ              -0.14
    Positive Logits
     afraid            0.22
     judgment          0.21
     necessarily       0.21
     judgement         0.20
     rein              0.19
     conform           0.19
     limit             0.18
     just              0.18
     conventional      0.18
     merely            0.18
    Activation Density: 0.244%

    No Known Activations