    Explanations

    phrases related to sarcasm or humorous exaggeration

    phrases and expressions of disbelief or sarcasm

    Negative Logits
    Token         Logit
    "marked"      -0.95
    "ŃĶ"          -0.79
    "namese"      -0.72
    "bill"        -0.70
    "marks"       -0.69
    "portion"     -0.68
    "bred"        -0.67
    "foreseen"    -0.66
    "Root"        -0.65
    "ravings"     -0.64
    Positive Logits
    Token            Logit
    " kidding"       1.50
    " joking"        0.98
    " aside"         0.89
    "terday"         0.81
    " spared"        0.79
    " aloud"         0.75
    "zzle"           0.67
    " yourselves"    0.67
    "olicy"          0.67
    " Laugh"         0.65
    Activation Density: 0.009%

    No Known Activations