Explanations
expressions of predictability and repetitiveness in humor and content
Negative Logits
gid            -0.17
unlike         -0.15
sty            -0.14
ophy           -0.14
authService    -0.14
Sew            -0.14
seperate       -0.14
raph           -0.13
onaut          -0.13
Inline         -0.13
Positive Logits
predictable    0.18
repet          0.18
repetition     0.18
repetitive     0.18
unins          0.18
predict        0.17
identical      0.17
affer          0.16
repetitions    0.15
generic        0.15
Activations Density 0.191%