    Explanations

    expressions of happiness or positive emotions

    Negative Logits
    Token (quoted)        Logit
    "GenerationType"      -0.78
    "र्भ"                  -0.77
    " كمان"               -0.74
    "िल्"                  -0.69
    "hithe"               -0.68
    " zagran"             -0.66
    " Baz"                -0.63
    "Thru"                -0.63
    [token missing]       -0.63
    " Borges"             -0.62
    Positive Logits
    Token (quoted)        Logit
    " happy"              1.45
    " Happy"              1.37
    "HAPPY"               1.36
    " HAPPY"              1.36
    " Happiness"          1.32
    " happiness"          1.31
    " happier"            1.27
    "happy"               1.26
    "happiness"           1.21
    "Happiness"           1.20
    Activation Density: 0.029%

    No Known Activations