Explanations
The neuron fires on mentions of the word “Pizza” (or its close variants) in the text.
Negative Logits
Token         Logit
soul          -0.07
aul           -0.06
arou          -0.06
иком          -0.06
eBook         -0.06
cases         -0.06
<number       -0.06
categories    -0.06
late          -0.06
Discussion    -0.06
Positive Logits
Token         Logit
pizza         0.12
Pizza         0.11
Pizza         0.10
pizzas        0.08
pizza         0.08
tileSize      0.07
?.            0.07
izza          0.07
INA           0.07
�             0.07
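As a quick sanity check of the explanation against the positive logits listed above, the sketch below counts how many of the ten tokens contain the substring "izza" (case-insensitive). The tokens and values are copied from the listing; the helper name `matches_pizza` is hypothetical. Repeated tokens such as "pizza" are kept as separate entries, on the assumption that they are distinct vocabulary items whose distinguishing whitespace was lost in extraction.

```python
# Top positive logits as listed on the page: (token, logit effect).
# "\ufffd" stands in for the undecodable token shown as a replacement character.
POSITIVE_LOGITS = [
    ("pizza", 0.12),
    ("Pizza", 0.11),
    ("Pizza", 0.10),
    ("pizzas", 0.08),
    ("pizza", 0.08),
    ("tileSize", 0.07),
    ("?.", 0.07),
    ("izza", 0.07),
    ("INA", 0.07),
    ("\ufffd", 0.07),
]


def matches_pizza(token: str) -> bool:
    """Hypothetical check: does the token look like a piece of the word 'pizza'?"""
    return "izza" in token.lower()


matching = [tok for tok, _ in POSITIVE_LOGITS if matches_pizza(tok)]
print(f"{len(matching)} of {len(POSITIVE_LOGITS)} tokens match")
```

The highest-weighted tokens all match; the unrelated ones (tileSize, ?., INA) appear only in the 0.07 tail alongside "izza".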
Activations Density: 0.004%
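The page does not define activation density. Assuming it means the fraction of tokens on which the neuron is active, 0.004% corresponds to roughly 4 active tokens per 100,000. The sketch below illustrates that reading with a hypothetical helper, `activation_density`, and synthetic activations (not data from this neuron).

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    This definition is an assumption; the page does not state how
    density is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 4 active tokens out of 100,000.
acts = [0.0] * 100_000
for i in (10, 2_000, 50_000, 99_999):
    acts[i] = 1.5

density = activation_density(acts)
print(f"{density * 100:.3f}%")
```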