INDEX
Explanations
mentions of meals or food-related activities
instances of the word "meal."
Negative Logits
aft        -0.78
vernment   -0.68
hips       -0.65
founded    -0.63
itars      -0.61
shr        -0.60
inates     -0.60
herty      -0.60
Clifford   -0.59
idency     -0.59
Positive Logits
meal       1.47
worms      1.26
Meal       1.26
meals      1.22
worm       0.95
meal       0.92
eaten      0.90
foods      0.85
oleon      0.84
dinners    0.79
Activations Density 0.006%