INDEX
Explanations
references to humor and comedic elements
Negative Logits
神宮         -0.58
}{|          -0.54
deepEqual    -0.50
atsby        -0.49
kör          -0.48
GMENT        -0.48
த்           -0.47
спол         -0.47
küpe         -0.46
первых       -0.45
Positive Logits
joke         2.05
jokes        2.03
humor        1.99
humour       1.92
Humor        1.92
comedy       1.90
humor        1.85
comedic      1.83
funny        1.82
Joke         1.80
Activations Density 0.131%
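
The density figure above is, under the convention commonly used for feature dashboards, the fraction of sampled token positions on which the feature fires (activation above zero). A minimal sketch under that assumption; the function name `activation_density` and the sample data are hypothetical and not taken from this page's tooling:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 131 of 100,000 token positions.
sample = [1.0] * 131 + [0.0] * (100_000 - 131)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.131%
```

A density this low means the feature is sparse: it is active on roughly 1 in 760 tokens, which is consistent with a feature tied to a narrow topic such as humor.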