Explanations
expressions of happiness or positive emotions
Negative Logits
Token           Logit
GenerationType  -0.78
र्भ              -0.77
كمان            -0.74
िल्              -0.69
hithe           -0.68
zagran          -0.66
Baz             -0.63
Thru            -0.63
氓              -0.63
Borges          -0.62
Positive Logits
Token           Logit
happy           1.45
Happy           1.37
HAPPY           1.36
HAPPY           1.36
Happiness       1.32
happiness       1.31
happier         1.27
happy           1.26
happiness       1.21
Happiness       1.20
Activations Density: 0.029%
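
The density figure is usually read as the fraction of token positions on which the feature fires. The page does not say how it was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" on a token when its activation
    is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, of which 3 activate the feature.
sample = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.029% would mean the feature is active on roughly 3 out of every 10,000 tokens, which fits a narrow feature tied to a single word family such as "happy".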