INDEX
Explanations
references to cakes and baking
Negative Logits
Token        Logit
Soup         -0.18
chili        -0.18
soup         -0.18
soup         -0.17
Soup         -0.16
_ENDPOINT    -0.16
Chili        -0.15
족           -0.15
Bent         -0.15
ettle        -0.14
Positive Logits
Token        Logit
cake         0.35
cakes        0.32
cake         0.31
Cake         0.30
Cake         0.26
cakes        0.26
cupcakes     0.26
baker        0.25
batter       0.24
Baker        0.24
Activations Density 0.041%