INDEX
Explanations
This neuron activates on occurrences of the word “run,” particularly when it appears as a command or action in code or shell-instruction contexts.
Negative Logits
theless     -0.08
delta       -0.06
Wide        -0.06
distorted   -0.06
edList      -0.06
iele        -0.06
rored       -0.06
EEE         -0.06
belt        -0.06
stare       -0.06
Positive Logits
run         0.08
Run         0.08
n           0.08
_run        0.08
-use        0.07
-N          0.07
UN          0.07
conduct     0.07
Jon         0.07
ان          0.07
Activations Density: 0.009%