INDEX
    Explanations

    expressions of positive sentiment and appreciation

    Negative Logits
    Token       Logit
    "azor"      -0.17
    "estre"     -0.14
    "((-"       -0.14
    "achten"    -0.14
    "ugo"       -0.13
    "indow"     -0.13
    "æħ§"       -0.13
    "à¥Ĥड"      -0.13
    "ecom"      -0.13
    " kys"      -0.13
    Positive Logits
    Token       Logit
    " nice"     0.35
    " neat"     0.29
    "nice"      0.28
    " hum"      0.26
    " cool"     0.26
    "Nice"      0.25
    " grat"     0.24
    " special"  0.24
    " Nice"     0.24
    " great"    0.23
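
The two logit lists above read like the lowest- and highest-scoring tokens from a per-token score table. A minimal sketch of that ranking step follows, assuming the scores are available as a token-to-value mapping. The function name `rank_logit_effects` and the dictionary layout are hypothetical and are not taken from the page; the four sample values are copied from the lists above.

```python
def rank_logit_effects(effects, k=10):
    """Return (most_negative, most_positive) lists of (token, value) pairs.

    `effects` is a hypothetical mapping from token string to logit value;
    the source page does not say how its values were produced.
    """
    # Sort ascending by value: most negative tokens come first.
    ordered = sorted(effects.items(), key=lambda kv: kv[1])
    most_negative = ordered[:k]
    # Reverse the ordering to read off the most positive tokens.
    most_positive = ordered[::-1][:k]
    return most_negative, most_positive


# Sample values copied from the lists above. Leading spaces are part of
# the token, so " nice" and "nice" are distinct entries.
effects = {" nice": 0.35, " neat": 0.29, "nice": 0.28, "azor": -0.17, "estre": -0.14}
negative, positive = rank_logit_effects(effects, k=2)
```

Quoting the tokens keeps the leading-space variants apart, which matters because " nice", "nice", "Nice" and " Nice" all appear as separate rows.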
    Act Density 0.102%
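
The page does not define "Act Density". One common reading, assumed here, is the fraction of tokens on which the feature's activation is above zero. The sketch below computes that fraction; `activation_density`, its `threshold` parameter, and the sample data are all hypothetical illustrations and do not reproduce the 0.102% figure.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up data: the feature fires on 1 of 1000 token positions.
sample = [0.0] * 999 + [1.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density near 0.1% would mean the feature is active on roughly one token in a thousand.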

    No Known Activations