Explanations
instances of the word "pun" and its variations, indicating a focus on humor or wordplay
Negative Logits
dz        -0.19
444       -0.17
pheres    -0.15
やす      -0.15
uben      -0.15
ิงห       -0.15
den       -0.14
んど      -0.14
irez      -0.14
adden     -0.14
Positive Logits
jabi      0.34
ishment   0.26
isher     0.26
pun       0.25
ishments  0.24
intended  0.24
jab       0.24
Pun       0.21
ishing    0.21
pun       0.21
Activations Density 0.006%