Explanations
instances of humor and puns in the text
Negative Logits
こんに         -0.15
caps          -0.15
inoa          -0.14
Slov          -0.14
avra          -0.14
usercontent   -0.14
:UIAlert      -0.13
unce          -0.13
oute          -0.13
ści           -0.13
Positive Logits
ulp           0.17
kins          0.16
rung          0.15
Chaos         0.15
omet          0.15
elps          0.15
fraction      0.14
igers         0.14
iyi           0.14
erals         0.14
Activations Density 0.007%