Explanations
References to comedic or humorous elements, particularly "punchlines."
Negative Logits
usk       -0.16
income    -0.15
æ®Ĭ       -0.14
bero      -0.14
.cms      -0.14
mind      -0.14
Rub       -0.14
ollo      -0.14
engo      -0.14
rub       -0.14
Positive Logits
punch     0.26
y         0.24
punching  0.23
Punch     0.22
eson      0.21
holes     0.21
punched   0.21
arella    0.20
er        0.20
-hole     0.20
Activations Density 0.009%
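
To put the density figure in perspective, the short sketch below converts it into an expected count of activating tokens. It assumes that "activations density" means the fraction of sampled tokens on which the feature has a nonzero activation; the dashboard does not state this definition. The function name and the corpus size are hypothetical, chosen only for illustration.

```python
def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires.

    Assumption: density_percent is the percentage of tokens with a
    nonzero activation, as read off the dashboard (e.g. 0.009).
    """
    return density_percent / 100.0 * n_tokens


if __name__ == "__main__":
    # Hypothetical corpus size of one million tokens.
    count = expected_activations(0.009, 1_000_000)
    print(f"about {count:.0f} activating tokens per million")
```

Under that assumption the feature is very sparse: it fires on roughly 9 out of every 100,000 tokens.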