    Explanations

    greetings or welcome messages in text

    references to groups of people in a conversational context

    Negative Logits
    " territ"      -0.69
    "sole"         -0.69
    "adem"         -0.57
    " Ukrain"      -0.56
    "mination"     -0.54
    " tenant"      -0.53
    "ourses"       -0.53
    "uphem"        -0.52
    "Dialog"       -0.52
    " occupies"    -0.52

    Positive Logits
    "!"             0.90
    " :)"           0.81
    "!!!!"          0.80
    "opausal"       0.79
    "!!"            0.79
    "!!!"           0.78
    " alike"        0.77
    "!:"            0.77
    " 🙂"           0.75
    " :-)"          0.75
    Activation Density: 0.079%

    No Known Activations