Explanations
code/programming
The neuron activates on common function words, especially auxiliary and modal verbs (e.g. “been,” “will,” “be,” “don’t”), and on other high-frequency words such as “that,” “think,” and “im,” rather than on content words or code tokens.
Negative Logits
Token           Logit
cassette        -0.07
Goose           -0.07
_uniform        -0.06
Username        -0.06
-up             -0.06
�江             -0.06
přev            -0.06
_curve          -0.06
bsolute         -0.06
Khan            -0.06
Positive Logits
Token           Logit
"]:↵            0.06
investigative   0.06
/> ↵            0.06
InView          0.06
careless        0.06
اختل            0.06
]+"             0.05
ObjectContext   0.05
differ          0.05
));↵↵↵          0.05
Activations Density 0.162%