Explanations
mentions of food or dining-related words
words related to food, specifically types or categories of food
Negative Logits
Token     Logit
ness      -0.87
nesses    -0.78
dale      -0.76
endon     -0.75
igans     -0.74
sung      -0.72
ged       -0.72
aine      -0.70
ened      -0.68
ingen     -0.68
Positive Logits
Token       Logit
chnology    1.19
llular      1.02
rers        0.89
ctic        0.78
anu         0.76
lli         0.74
lled        0.71
nucleus     0.70
cki         0.70
ptive       0.66
Activations Density: 0.128%