INDEX
Explanations
references to cakes, desserts, and celebratory occasions
Negative Logits
nesota      -0.84
ually       -0.81
OSE         -0.68
Lomb        -0.68
iveness     -0.68
selves      -0.67
ostics      -0.67
ued         -0.65
GV          -0.65
audi        -0.64
Positive Logits
batter      0.95
cake        0.91
cake        0.91
cakes       0.90
meal        0.90
dough       0.88
cakes       0.86
pillar      0.86
recipe      0.83
cup         0.81
Activations Density 0.055%