INDEX
Explanations
mentions of the word "pizza" in various contexts
references to pizza
Negative Logits
itud      -0.80
lasses    -0.79
hips      -0.78
uate      -0.77
Thom      -0.77
track     -0.71
uating    -0.71
ought     -0.68
ARB       -0.68
ility     -0.67
Positive Logits
dough      1.04
crust      1.02
oven       0.96
Hut        0.92
Dough      0.90
pies       0.87
delivery   0.85
pizza      0.84
ocalypse   0.83
isine      0.83
Activations Density: 0.028%