Explanations
Wikipedia
This neuron almost never activates on any content; it is effectively “dead” and does not detect any pattern in the text.
Negative Logits
dealership         -0.07
dedicate           -0.07
Triangles          -0.06
DU                 -0.06
Base               -0.06
rip                -0.06
-box               -0.06
(token not shown)  -0.06
pit                -0.06
polygons           -0.06
Positive Logits
does               0.07
üfus               0.07
.auth              0.06
Kč                 0.06
ynı                0.06
'[                 0.06
потріб             0.06
нен                0.06
_tid               0.06
діяль              0.06
Activations Density 0.009%