INDEX
Explanations
mentions of food and beverages in various contexts
Negative Logits
foods    -0.38
Foods    -0.38
Food     -0.35
FOOD     -0.34
foods    -0.33
food     -0.33
food     -0.32
Food     -0.32
-food    -0.31
_food    -0.29
Positive Logits
drink        0.41
beverage     0.32
Drink        0.32
Bever        0.32
drinks       0.31
drink        0.30
beverages    0.30
bev          0.29
Beverage     0.28
Drink        0.27
Activations Density: 0.047%