INDEX
Explanations
phrases related to food and beverages, particularly in social settings
Negative Logits
Token       Logit
foods       -0.34
Foods       -0.32
Food        -0.29
alimentos   -0.27
FOOD        -0.27
food        -0.27
Food        -0.27
comida      -0.26
foods       -0.26
food        -0.26
Positive Logits
Token       Logit
drink       0.40
drinks      0.34
Drink       0.33
Bever       0.33
beverage    0.32
beverages   0.32
Beverage    0.30
Drinks      0.29
drink       0.29
Drink       0.28
Activations Density: 0.064%
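
For readers who want to reproduce a summary like this one, the following is a minimal sketch of how the logit lists and the density figure could be assembled. It assumes the logit values are per-token scores for the feature and that density is the fraction of token positions where the feature's activation is above zero. Both assumptions, the function names `top_logits` and `activation_density`, and the toy data are hypothetical and are not taken from the dashboard's own tooling.

```python
# Toy data for illustration only. A few token scores echo the listing above;
# the "table" entry and every activation value are invented.


def top_logits(tokens, effects, k):
    """Return the k most negative and k most positive (token, effect) pairs."""
    ranked = sorted(zip(tokens, effects), key=lambda pair: pair[1])
    negative = ranked[:k]         # most negative first
    positive = ranked[::-1][:k]   # most positive first
    return negative, positive


def activation_density(activations):
    """Fraction of positions where the feature fires (activation > 0)."""
    if not activations:
        return 0.0
    return sum(1 for a in activations if a > 0) / len(activations)


tokens = ["drink", "food", "beverage", "table", "foods"]
effects = [0.40, -0.27, 0.32, 0.01, -0.34]
negative, positive = top_logits(tokens, effects, k=2)

# Eight hypothetical token positions, two of which activate the feature.
activations = [0.0, 0.0, 1.7, 0.0, 0.0, 0.0, 0.0, 0.4]
density = activation_density(activations)

print("negative:", negative)
print("positive:", positive)
print(f"density: {density:.3%}")
```

Sorting once and reading from both ends keeps the two lists consistent with each other. A real pipeline would run over a full vocabulary and a much larger sample of positions, which is how a density as small as the one reported here can arise.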