Explanations
The neuron activates on mentions of “fallback” (or “fall back”) mechanisms or options in text or code.
Negative Logits
_instructions    -0.08
Completion       -0.08
sorter           -0.08
Needle           -0.07
.groupby         -0.07
idian            -0.07
ption            -0.07
.AddParameter    -0.07
дво              -0.07
Surface          -0.07
Positive Logits
fallback         0.10
fallback         0.07
milfs            0.07
主义             0.06
flatMap          0.06
handles          0.06
lando            0.06
-points          0.06
Fallout          0.06
식               0.06
Activations Density 0.005%