INDEX
Explanations
phrases related to sarcasm or humorous exaggeration
phrases and expressions of disbelief or sarcasm
Negative Logits
marked      -0.95
ŃĶ          -0.79
namese      -0.72
bill        -0.70
marks       -0.69
portion     -0.68
bred        -0.67
foreseen    -0.66
Root        -0.65
ravings     -0.64
Positive Logits
kidding     1.50
joking      0.98
aside       0.89
terday      0.81
spared      0.79
aloud       0.75
zzle        0.67
yourselves  0.67
olicy       0.67
Laugh       0.65
Activations Density: 0.009%