INDEX
Explanations
mentions of pizza
references to pizza
Negative Logits
Token       Logit
Thom        -0.85
hips        -0.85
itud        -0.78
ivities     -0.74
track       -0.74
draw        -0.72
lasses      -0.71
Luther      -0.69
uate        -0.69
itudinal    -0.67
Positive Logits
Token       Logit
dough       1.15
oven        1.07
crust       1.03
ocalypse    0.94
pizza       0.91
Dough       0.89
pies        0.89
isine       0.85
delivery    0.84
pizz        0.83
Activations Density: 0.014%
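
As a rough illustration of how a density figure like this could be read, the sketch below assumes that activation density means the fraction of token positions on which the feature has a nonzero activation. The source does not state that definition, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, with the feature firing on 14.
acts = [0.0] * 100_000
for i in range(14):
    acts[i * 7_000] = 2.5  # arbitrary nonzero activation value

density = activation_density(acts)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.014% corresponds to the feature firing on roughly 14 of every 100,000 tokens.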