Explanations
words related to humor or situations perceived as funny
mentions of "joke"
Negative Logits
undai      -0.77
actions    -0.72
phalt      -0.72
ioch       -0.70
porting    -0.70
ignt       -0.69
outheast   -0.68
ilings     -0.68
rights     -0.66
necess     -0.65
Positive Logits
joke       1.00
ously      0.99
jokes      0.88
joking     0.82
Pom        0.81
banter     0.77
biz        0.77
okingly    0.74
osal       0.73
bags       0.73
Activations Density: 0.011%