INDEX
Explanations
words related to excitement and positivity
Negative Logits
soever      -0.86
ILCS        -0.84
sequently   -0.73
udeau       -0.73
rouse       -0.72
atform      -0.71
ilst        -0.69
cific       -0.69
istrates    -0.69
enture      -0.68
Positive Logits
gonna       1.31
alright     1.05
funny       1.05
awesome     1.03
awfully     0.99
hilarious   0.99
cute        0.98
fuckin      0.97
kinda       0.97
fucked      0.97
Activations Density: 0.331%