INDEX
Explanations
instances of humor and irony in language
Negative Logits
ajo      -0.17
ilk      -0.15
girls    -0.15
ÑģÑĤа    -0.15
Ñģм      -0.14
woff     -0.14
venta    -0.14
eya      -0.14
ķĮ       -0.13
ILA      -0.13
Positive Logits
animate    0.17
.uniform   0.15
umpt       0.14
Ú¾         0.14
otron      0.14
IFORM      0.14
åĿĢ        0.14
_CBC       0.14
主         0.14
uniform    0.14
Activations Density: 0.115%