INDEX
Explanations
movie credits
The neuron consistently activates on personal names (e.g., of directors, actors, and producers) in movie-credit text.
Negative Logits
замет    -0.07
NONE    -0.06
namese    -0.06
ैश    -0.06
Kit    -0.06
nuclear    -0.06
любой    -0.06
CBC    -0.06
ीं,    -0.06
infiltration    -0.06
Positive Logits
jose    0.06
weigh    0.06
nue    0.06
InnerHTML    0.06
discrim    0.06
炸    0.06
žení    0.06
فة    0.06
<↵    0.06
interested    0.06
Activations Density 0.032%