Explanations
code punctuation
The neuron strongly activates on numeric literals (especially the “0” tokens) in the code.
Negative Logits
Away            -0.07
>w              -0.07
〇              -0.06
.RightToLeft    -0.06
اختص            -0.06
Einsatz         -0.06
ALLED           -0.06
ліка            -0.06
/"↵↵            -0.06
ENCY            -0.06
Positive Logits
zvyš            0.07
Fl              0.07
Graph           0.06
Swagger         0.06
{{$             0.06
decreasing      0.06
Truth           0.06
는데            0.06
Argument        0.06
ncols           0.06
Activations Density 0.006%
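
For a sense of scale, the density figure can be read as the fraction of tokens on which the neuron fires. The sketch below is illustrative only: it assumes density means the share of tokens with activation above zero, which the page does not state, and the function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the neuron
    fires at all; the dashboard may define it differently.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the neuron fires on 6 tokens out of 100,000.
sample = [0.0] * 99_994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.006%
```

Under that reading, a density of 0.006% corresponds to roughly 6 firing tokens per 100,000.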