INDEX
Explanations
references to humor and comedy
Negative Logits
enie        -0.18
855         -0.18
ffen        -0.17
ed          -0.16
chl         -0.16
humorous    -0.16
ivr         -0.16
holder      -0.16
edb         -0.15
854         -0.15
Positive Logits
bone        0.23
erals       0.19
relief      0.19
Relief      0.19
bone        0.19
Bone        0.17
ously       0.17
Bone        0.17
Bones       0.17
bones       0.16
Activations Density 0.033%