Explanations
words related to the concept of "pun" or humorous wordplay
Negative Logits
Token     Logit
dz        -0.18
endez     -0.16
ager      -0.16
irez      -0.15
endant    -0.15
pheres    -0.15
mitter    -0.14
ez        -0.14
stown     -0.14
tü        -0.14
Positive Logits
Token     Logit
jabi      0.31
pun       0.30
ishment   0.26
pun       0.26
Pun       0.25
jab       0.24
isher     0.22
nett      0.21
ned       0.21
intended  0.21
Activations Density: 0.007%