    Explanations

    phrases where negation is involved

    negative contractions suggesting inability or negation

    Negative Logits
    Token           Logit
    "catentry"      -0.81
    "Offline"       -0.78
    "ewater"        -0.72
    "å"             -0.70
    "orst"          -0.69
    "iosyncr"       -0.68
    "ãĥ¼ãĥĨãĤ£"     -0.67
    "çͰ"            -0.64
    "ocr"           -0.64
    " Ãľ"           -0.63
    Positive Logits
    Token           Logit
    " temptation"   0.77
    " anymore"      0.64
    " sight"        0.61
    "amaz"          0.61
    "FN"            0.58
    " sweets"       0.57
    " Bronx"        0.56
    " grinning"     0.56
    "INS"           0.56
    "RAG"           0.56
    Act Density 0.522%
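
    As a rough illustration of how a density figure like the one above could be read, the sketch below treats activation density as the fraction of token positions on which the feature fires. That definition, the function name `activation_density`, the zero threshold, and the sample data are all assumptions made for illustration; the page itself does not say how the number was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled token positions
    on which the feature is active (activation > threshold).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 5 of which are active.
sample = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

    Under this reading, a density of 0.522% would mean the feature is active on roughly 5 of every 1000 sampled tokens.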

    No Known Activations