Explanations
This neuron almost never activates (activation density 0.007%). It is effectively “dead” and shows no consistent response to any particular token.
Negative Logits
зна           -0.07
्स            -0.06
dissertation  -0.06
دانش          -0.06
xima          -0.06
대학교         -0.06
IBM           -0.06
Maryland      -0.06
surv          -0.06
urd           -0.06
Positive Logits
Range         0.08
additive      0.07
(Collection   0.07
--↵           0.07
nickname      0.07
(man          0.07
-out          0.07
-buffer       0.06
Array         0.06
__(/*!        0.06
Activations Density 0.007%
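
For readers unfamiliar with the density figure, the following is a minimal sketch of one common way such a number could be computed: the fraction of sampled tokens on which the neuron's activation exceeds a threshold. The function name `activation_density`, the zero threshold, and the sample data are all assumptions made for illustration; the page does not state how its own figure was derived.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    `activations` is a flat sequence of per-token activation values.
    The threshold of 0.0 is an assumption, not something the page states.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 7 of which activate the neuron.
sample = [0.0] * 99_993 + [0.5] * 7
density = activation_density(sample)
print(f"{density:.3%}")
```

Under these assumptions, a density this low means the neuron fires on only a handful of tokens per hundred thousand, which is consistent with the "effectively dead" reading above.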