INDEX
    Explanations

    expressions of confusion or frustration in conversations

    Negative Logits
    Token         Logit
    "ha"          -0.15
    "ù"           -0.14
    " facult"     -0.14
    " (?)"        -0.14
    "591"         -0.14
    " Ramsey"     -0.13
    "amd"         -0.13
    " Gale"       -0.13
    " prox"       -0.13
    "ham"         -0.13

    Positive Logits
    Token         Logit
    " shit"       0.59
    " crap"       0.54
    "shit"        0.48
    "crap"        0.43
    " garbage"    0.42
    " rubbish"    0.40
    " BS"         0.39
    " sh"         0.35
    " junk"       0.35
    " nonsense"   0.35
    Activation Density: 0.246%

    No Known Activations