INDEX
Explanations
The neuron detects mentions of the digit “4,” particularly as part of references to “GPT-4.”
Negative Logits
Techniques    -0.06
lest          -0.06
rhs           -0.06
Patients      -0.06
patients      -0.06
ën            -0.06
_bb           -0.06
Queen         -0.06
(line         -0.06
Initializer   -0.06
Positive Logits
momentos      0.08
mountains     0.07
chuck         0.07
ràng          0.06
哪            0.06
Shaman        0.06
neas          0.06
modne         0.06
isha          0.06
-nav          0.06
Activations Density 0.002%