Explanations
references to different types of cake
references to cake
Negative Logits
nesota     -0.79
ENTION     -0.72
selves     -0.72
iveness    -0.70
ually      -0.69
audi       -0.68
Lomb       -0.67
ostics     -0.66
Fargo      -0.63
nen        -0.63
Positive Logits
cakes      0.99
cake       0.97
cake       0.94
cakes      0.91
meal       0.90
batter     0.83
walk       0.82
pillar     0.79
fruit      0.79
ĸļ         0.76
Activations Density 0.022%