INDEX
Explanations
words associated with humor or absurdity
Negative Logits
ora         -0.17
esel        -0.15
udad        -0.15
avanaugh    -0.14
whelming    -0.14
å¥ĩ         -0.14
_MAJOR      -0.14
é«          -0.14
.neo        -0.14
endon       -0.14
Positive Logits
erte        0.17
PIN         0.15
mach        0.15
squ         0.14
/no         0.14
amus        0.14
scheme      0.13
.jar        0.13
iar         0.13
Entr        0.13
Activations Density 0.029%