INDEX
Explanations
words related to ingredients and cooking instructions, especially celery
references to celery within various contexts
Negative Logits
Token     Logit
ulated    -0.79
ulative   -0.77
eston     -0.75
PF        -0.74
ested     -0.69
ifier     -0.68
embed     -0.68
ers       -0.66
ulating   -0.64
Winged    -0.64
Positive Logits
Token     Logit
flies     0.93
llan      0.88
tery      0.86
nces      0.83
onica     0.76
ea        0.76
oire      0.75
yy        0.73
shelves   0.71
utical    0.71
Activations Density: 0.061%