INDEX
Explanations
phrases or discussions related to food and recipes
Negative Logits
ļéĨĴ           -0.77
][             -0.70
istar          -0.70
Originally     -0.69
terday         -0.69
responsible    -0.68
arious         -0.65
incial         -0.62
uper           -0.61
Iraq           -0.61
Positive Logits
reap           1.18
enjoy          1.16
preferably     1.04
beware         1.03
congr          1.00
bask           1.00
then           0.98
yourself       0.97
THEN           0.95
cknow          0.95
Activations Density: 0.115%