INDEX
Explanations
words related to happiness and unhappiness
expressions related to happiness and emotional well-being
Negative Logits
sk        -0.78
Ry        -0.72
heit      -0.72
books     -0.71
Vs        -0.70
UAL       -0.67
adr       -0.66
Maps      -0.65
armor     -0.65
DoS       -0.65
Positive Logits
happier   0.98
happiest  0.87
happy     0.82
unhappy   0.81
adolesc   0.80
ishers    0.73
mosqu     0.73
iated     0.72
experien  0.71
iliate    0.71
Activations Density 0.008%