INDEX
Explanations
references to specific types of food or culinary items
Negative Logits
ãĥ        -0.17
inson     -0.17
pars      -0.17
a         -0.16
p         -0.16
ested     -0.16
b         -0.15
lected    -0.15
brahim    -0.15
analy     -0.15
Positive Logits
بÙĪÙĦ      0.20
azing     0.18
OUNT      0.18
pton      0.18
nesty     0.18
ERICAN    0.18
plitude   0.17
ycin      0.17
bling     0.17
orph      0.16
Activations Density: 0.039%