Explanations
This neuron activates on occurrences of the proper name “William.”
Negative Logits
Token       Logit
in          -0.07
IN          -0.07
SO          -0.07
ein         -0.07
oo          -0.06
1           -0.06
Shia        -0.06
CN          -0.06
_Real       -0.06
oin         -0.06
Positive Logits
Token       Logit
James       0.12
James       0.11
Frederick   0.10
Charles     0.09
Lawrence    0.09
William     0.09
Charles     0.09
AMES        0.09
William     0.09
Samuel      0.09
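The two logit lists read like the two ends of one ranking of per-token scores. The page does not say how the scores are produced, so the following is only a minimal sketch under that assumption. The function name `top_logits` and the filler entry `("the", 0.00)` are hypothetical; the other values are copied from the lists above.

```python
def top_logits(scores, k):
    """Return (most_negative, most_positive), each a list of k (token, score) pairs."""
    # Sort ascending by score: the head holds the most negative entries.
    ranked = sorted(scores, key=lambda pair: pair[1])
    negative = ranked[:k]
    # Reverse to get descending order: the head holds the most positive entries.
    positive = ranked[::-1][:k]
    return negative, positive


# A few scores taken from the lists above, plus one hypothetical neutral token.
scores = [
    ("James", 0.12),
    ("Frederick", 0.10),
    ("in", -0.07),
    ("oo", -0.06),
    ("the", 0.00),  # hypothetical filler, not from the page
]

negative, positive = top_logits(scores, 2)
print(negative)
print(positive)
```

Repeated token strings such as "James" can still be distinct vocabulary entries (for example, with and without a leading space), so a real ranking would key on token IDs rather than on the displayed text.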
Activations Density: 0.075%
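The density figure can be read as the share of tokens on which the neuron fires. The page does not define it, so this sketch assumes "density" means the fraction of tokens whose activation exceeds a threshold of zero. The function name `activation_density` and the sample activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above the threshold."""
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active tokens out of 4000, chosen to give 0.075%.
activations = [0.0] * 3997 + [1.3, 0.4, 2.1]
density = activation_density(activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.075% corresponds to roughly 75 active tokens per 100,000, which fits a neuron tied to a narrow set of tokens such as given names.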