INDEX
    Explanations

    expressions of gratitude and positive sentiments in conversations

    Negative Logits
    "boo"       -0.16
    "ie"        -0.16
    "ivia"      -0.15
    "uum"       -0.15
    " bore"     -0.14
    "ides"      -0.14
    "_sound"    -0.14
    "ulares"    -0.14
    "ombre"     -0.14
    "ces"       -0.13
    Positive Logits
    " pleasure"   0.24
    " nice"       0.20
    "Nice"        0.19
    " privilege"  0.18
    " nic"        0.18
    " Nice"       0.17
    " Ple"        0.17
    "nice"        0.17
    " prive"      0.17
    "#ad"         0.17
    Activation Density: 0.128%

    No Known Activations