    Explanations

    the word "not" followed by a strong emphasis on the subsequent word or phrase

    negations or phrases indicating refusal

    Negative Logits
    å¥         -0.85
    ounters    -0.73
    cano       -0.69
    æ©         -0.67
    ãĤ¼        -0.65
     Hots      -0.65
    ously      -0.65
     peaks     -0.64
    oise       -0.63
    çļ         -0.63
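
Several entries in the negative-logit list (`å¥`, `æ©`, `ãĤ¼`, `çļ`) look like the printable stand-ins that byte-level BPE vocabularies use for raw bytes. The sketch below assumes, without confirmation from this page, that the model uses a GPT-2 style byte-to-character table; `decode_token` is a hypothetical helper name that maps such a display string back to text.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2 style byte-level BPE.

    Assumption: the dashboard's tokenizer follows this scheme.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in printable}
    shift = 0
    for b in range(256):
        if b not in table:
            # Non-printable bytes are shifted to code points 256 and above.
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display):
    """Hypothetical helper: turn a vocabulary display string back into text."""
    raw = bytearray()
    for ch in display:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Character was already decoded for display (e.g. a plain space).
            raw.extend(ch.encode("utf-8"))
    # Tokens that hold only part of a multi-byte character cannot be
    # decoded on their own, so invalid bytes become U+FFFD.
    return bytes(raw).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["å¥", "æ©", "ãĤ¼", "çļ"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption `ãĤ¼` decodes to the katakana character ゼ, while `å¥`, `æ©`, and `çļ` are incomplete UTF-8 sequences (the first two bytes of a three-byte character) and cannot be resolved to a single character on their own.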
    Positive Logits
     bothered       1.29
     ashamed        1.25
     afraid         1.24
     interested     1.22
     gonna          1.21
     kidding        1.21
     aware          1.10
     necessarily    1.10
     fooled         1.06
     worried        1.05
    Activation Density: 0.112%

    No Known Activations