INDEX
Explanations
Asterisks and number signs
The neuron detects the leading asterisk “*” markers used in block-comment license headers.
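To make the described pattern concrete, here is a minimal sketch of the kind of text the explanation refers to. It is a plain regex heuristic, not the neuron's actual computation. The helper name `leading_asterisk_lines` and the sample header are hypothetical and not taken from the page.

```python
import re
from typing import List

# Hypothetical heuristic: a line whose first non-space character is "*",
# as in the continuation lines of a /* ... */ block comment.
LEADING_STAR = re.compile(r"^\s*\*")


def leading_asterisk_lines(text: str) -> List[int]:
    """Return 0-based indices of lines that begin with a leading asterisk."""
    return [
        i for i, line in enumerate(text.splitlines())
        if LEADING_STAR.match(line)
    ]


# Made-up license header, used only for illustration.
SAMPLE_HEADER = """/*
 * Copyright (c) Example Authors.
 * Licensed under an example license.
 */
int main(void) { return 0; }
"""

hits = leading_asterisk_lines(SAMPLE_HEADER)
```

The opening `/*` line is not counted because its first non-space character is `/`, while the closing ` */` line is counted because it starts with an asterisk.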
Negative Logits
Token        Value
Riverside    -0.07
PLEMENT      -0.06
fame         -0.06
práv         -0.06
fandom       -0.06
_mac         -0.06
_ALPHA       -0.06
tgt          -0.06
shady        -0.06
.pag         -0.06
Positive Logits
Token        Value
triggering   0.06
Cyan         0.06
였다         0.06
Naturally    0.06
*******/↵↵   0.06
yük          0.06
若           0.06
odí          0.06
كه           0.06
↵ ↵          0.06
Activations Density 0.037%