Explanations
code snippets
The neuron activates on numeric literal tokens (especially floating-point numbers).
Negative Logits
Token        Logit
mill         -0.07
Purpose      -0.06
_con         -0.06
孩子         -0.06
Recognition  -0.06
〃           -0.06
[:           -0.06
Funk         -0.06
Trilogy      -0.06
therapy      -0.06
Positive Logits
Token        Logit
цін          0.07
_TD          0.07
uncomment    0.07
τό           0.06
.form        0.06
adı          0.06
бенз         0.06
elsif        0.06
backward     0.06
……」↵↵       0.06
Activations Density: 0.189%
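
The density figure can be read as the share of token positions on which the neuron fires at all. The sketch below illustrates that reading under stated assumptions: the dashboard does not say how it computes density, so the zero threshold, the function name `activation_density`, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "active" means strictly greater than the threshold
    (0.0 by default). The source dashboard does not define this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 19 of them active.
sample = [0.0] * 9981 + [1.5] * 19
density = activation_density(sample)
print(f"Activations Density: {density:.3%}")
```

A density this low means the neuron is silent on the vast majority of tokens, which fits an explanation tied to a narrow token class such as numeric literals.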