INDEX
Explanations
mentions and descriptions of breakfast items
mentions of breakfast
Negative Logits
Token       Logit
aft         -0.75
andro       -0.70
aws         -0.69
Engineers   -0.69
pmwiki      -0.65
Reviewer    -0.64
RH          -0.63
pg          -0.63
LESS        -0.62
oping       -0.62
Positive Logits
Token       Logit
breakfast   1.35
mornings    1.01
Breakfast   1.01
cereal      0.99
toast       0.94
lunch       0.94
halla       0.92
eteria      0.91
meals       0.89
brunch      0.88
Activations Density: 0.007%