INDEX
Explanations
elements of humor and jokes
Negative Logits
istore               -0.55
oherty               -0.53
насеље               -0.53
обходи               -0.50
(token not shown)    -0.50
setBlock             -0.49
Na                   -0.48
cl                   -0.48
poitrine             -0.48
straße               -0.48
Positive Logits
joke                 1.32
joking               1.29
Joke                 1.29
jokes                1.21
Joke                 1.18
joke                 1.17
joked                1.10
Jokes                1.07
jokes                1.05
Jok                  0.99
Activations Density 0.071%
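
A density figure like the 0.071% above is commonly read as the share of tokens on which the feature fires at all. The page does not define the metric, so the sketch below assumes density means the percentage of activations above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of entries whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation; the source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the feature fires on 71 of 100,000 tokens.
sample = [1.0] * 71 + [0.0] * (100_000 - 71)
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under that reading, a density of 0.071% corresponds to roughly 7 active tokens per 10,000.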