    Explanations

    negations or expressions of denial

    Negative Logits
    "rape"      -0.15
    "bee"       -0.15
    "anych"     -0.15
    "nist"      -0.15
    "encers"    -0.15
    "uche"      -0.14
    "ikh"       -0.14
    "ième"      -0.14
    "itters"    -0.14
    "cul"       -0.14
    Positive Logits
    " matter"    0.53
    "matter"     0.41
    " Matter"    0.36
    " doubt"     0.33
    " wonder"    0.32
    " amount"    0.27
    " sooner"    0.24
    " offense"   0.22
    " mater"     0.22
    " Doub"      0.22
    Act Density 0.040%

    No Known Activations