INDEX
Explanations
references to specific types of food and dietary choices
Negative Logits
erty      -0.17
kova      -0.17
ively     -0.17
artial    -0.16
urn       -0.16
esc       -0.15
ovice     -0.15
ught      -0.15
AM        -0.15
urch      -0.15
Positive Logits
bone      0.22
pool      0.20
bohydr    0.19
rying     0.18
riages    0.18
ibbean    0.17
ilage     0.17
oten      0.17
thers     0.17
-Semit    0.17
Activations Density: 0.053%