Explanations
The neuron activates on mentions of “podcast” (and its variants like “podcasts” or “podcasting”).
Negative Logits
sep         -0.08
Ar          -0.06
胞          -0.06
_CLIP       -0.06
火          -0.06
Row         -0.06
बर          -0.06
Joy         -0.06
geometry    -0.06
bree        -0.06
Positive Logits
podcast     0.14
Podcast     0.12
podcasts    0.10
odcast      0.09
cast        0.08
icast       0.07
aney        0.07
casts       0.07
milfs       0.07
journalist  0.07
Activations Density 0.004%
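
For readers unfamiliar with the density figure above, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of tokens on which the neuron's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and only serve the illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the neuron fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the neuron fires on 4 tokens out of 100,000.
sample = [0.0] * 99_996 + [1.3, 0.7, 2.1, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage with three decimals
```

With this made-up sample, 4 firing tokens out of 100,000 correspond to the same 0.004% order of magnitude reported for this neuron, i.e. it is silent on the vast majority of tokens.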