INDEX
Explanations
This neuron activates on isolated capital-letter initials followed by a period, i.e., author or editor initials in citations.
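As a rough illustration of the pattern this explanation describes, the sketch below uses a regular expression to pick out standalone capital letters followed by a period. The name `find_initials` and the regex are hypothetical and approximate the written explanation only; they are not derived from the neuron's actual activations, and the citation string is a made-up example.

```python
import re

# Hypothetical approximation of the described pattern: one capital letter
# that is not preceded by another letter, immediately followed by a period.
INITIAL_RE = re.compile(r"(?<![A-Za-z])[A-Z]\.")

def find_initials(text: str) -> list[str]:
    """Return every isolated capital-letter initial (with its period) in text."""
    return INITIAL_RE.findall(text)

# Made-up citation-style string for illustration.
citation = "Smith, J. A. and Doe, R. (eds.), Proc. of the Workshop"
print(find_initials(citation))
```

The lookbehind keeps abbreviations such as "Proc." or "USA." from matching, since their final letter is preceded by another letter. A sentence that happens to end in a lone capital (for example "Plan B.") would still match, so this is only a loose proxy for the behavior described.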
Negative Logits
white        -0.07
_r           -0.07
,rp          -0.07
,如          -0.06
Categories   -0.06
_Draw        -0.06
trails       -0.06
Souls        -0.06
marsh        -0.06
Code         -0.06
Positive Logits
.mozilla     0.07
ByteBuffer   0.06
shader       0.06
рож          0.06
Nous         0.06
λλι          0.06
ี.           0.06
skou         0.06
.SERVER      0.06
Pablo        0.06
Activations Density 0.014%