INDEX
    Explanations

    negative constructs or words indicating lack and avoidance

    Negative Logits
    " Deal"     -0.15
    "mey"       -0.14
    " Ell"      -0.14
    "kate"      -0.14
    "borough"   -0.14
    "irl"       -0.14
    ".grp"      -0.14
    "phin"      -0.13
    "irim"      -0.13
    "andest"    -0.13
    Positive Logits
    "ADVERTISEMENT"   0.15
    "ÙİÙĤ"            0.15
    " Jaw"            0.15
    "/format"         0.15
    " ruce"           0.15
    "imes"            0.15
    " ner"            0.14
    "ovaly"           0.14
    " Wert"           0.14
    "ILON"            0.14
    Act Density 0.000%

    No Known Activations