INDEX
    Explanations

    negative statements or scenarios

    negations and phrases indicating the absence or lack of something

    Negative Logits
    Token           Logit
    "furt"          -0.84
    "estamp"        -0.74
    "ãĥĺ"           -0.72
    "ĨĴ"            -0.71
    "thood"         -0.70
    "met"           -0.69
    "wich"          -0.66
    "û"             -0.64
    "court"         -0.64
    " filib"        -0.63
    Positive Logits
    Token           Logit
    "anan"          0.75
    " already"      0.72
    "inka"          0.64
    "lamm"          0.62
    " occasionally" 0.60
    " plenty"       0.59
    "ttes"          0.59
    " shudder"      0.58
    " also"         0.58
    "RAG"           0.57
    Activation Density: 0.198%

    No Known Activations