INDEX
Explanations
references to cakes
Negative Logits
nesota      -0.78
ostics      -0.77
ENTION      -0.76
audi        -0.74
iveness     -0.72
selves      -0.69
Lomb        -0.69
Cheong      -0.67
agonists    -0.65
Fargo       -0.64
Positive Logits
cake        0.94
cakes       0.91
meal        0.90
cake        0.89
cakes       0.88
batter      0.87
walk        0.83
pillar      0.82
balls       0.79
fruit       0.78
Activations Density 0.018%