Explanations
This neuron rarely, if ever, activates, and no identifiable pattern was found in what it detects.
Negative Logits
Token       Logit
recogn      -0.07
.Init       -0.07
рь          -0.07
ذار         -0.07
atform      -0.06
别          -0.06
ertest      -0.06
Probe       -0.06
Европ       -0.06
.Listen     -0.06
Positive Logits
Token       Logit
slopes      0.07
Stones      0.07
loadImage   0.06
mere        0.06
();?>       0.06
Weg         0.06
….          0.06
Türkçe      0.06
vyd         0.06
mens        0.06
Activations Density: 0.015%