Explanations
The neuron activates on tokens containing the letter sequence “men” (e.g. in names like Menal, Menhaden, Menken).
Negative Logits
Token       Logit
Arctic      -0.07
úc          -0.07
arc         -0.07
icro        -0.07
ARC         -0.07
Turbo       -0.07
Circular    -0.07
604         -0.07
RFC         -0.07
禮          -0.06
Positive Logits
Token       Logit
Man         0.09
don         0.09
/man        0.09
Mom         0.09
man         0.08
mn          0.08
Son         0.08
Man         0.07
.number     0.07
.Man        0.07
Activations Density 0.020%
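
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed, assuming density means the fraction of sampled tokens on which the neuron's activation exceeds zero. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and chosen only for illustration; the dashboard's actual computation is not stated in the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    neuron fires at all; the source does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2 active tokens out of 10,000.
sample = [0.0] * 9998 + [1.3, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.020% would correspond to the neuron firing on roughly 2 of every 10,000 sampled tokens.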