Explanations
This neuron activates on occurrences of “torch.” API calls (i.e. tokens beginning with “torch.”).
Negative Logits
.flatMap    -0.08
AREA        -0.08
препара     -0.07
amin        -0.07
ENDING      -0.07
.min        -0.07
-boot       -0.07
.Bytes      -0.07
Division    -0.07
quan        -0.06
Positive Logits
torch       0.09
(torch      0.09
=torch      0.08
Torch       0.08
orch        0.07
Ritch       0.06
onte        0.06
torch       0.06
松          0.06
Scotch      0.06
Activations Density 0.003%
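
As a rough illustration of the density figure above, the sketch below assumes that "activation density" means the fraction of token positions on which the neuron's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions for illustration; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition: a token counts as "active" when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, of which 3 are active.
acts = [0.0] * 100_000
acts[10] = 1.2
acts[500] = 0.4
acts[99_999] = 2.7

density = activation_density(acts)
print(f"{density:.3%}")
```

With these made-up numbers, 3 active tokens out of 100,000 gives a density of the same order as the one reported, which shows how sparse such a neuron is: it is silent on nearly every token.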