INDEX
Explanations
The neuron fires on code-style tokens, i.e. programming-language syntax (such as braces, keywords, and backticks), rather than ordinary prose.
Negative Logits
Zero      -0.07
uela      -0.07
DeV       -0.07
Saudi     -0.06
đoạn      -0.06
Matches   -0.06
oplan     -0.06
sexy      -0.06
numero    -0.06
152       -0.06
Positive Logits
게시                      0.07
ReceiveMemoryWarning    0.06
rum                     0.06
остан                   0.06
-:                      0.05
silicone                0.05
='{$                    0.05
Nota                    0.05
은                       0.05
_stub                   0.05
Activations Density: 0.039%