Explanations
hamilton
This neuron primarily detects occurrences of the proper name “Hamilton.”
Negative Logits
Token                Logit
Red                  -0.07
Prep                 -0.07
DOC                  -0.06
-Free                -0.06
(no visible token)   -0.06
ears                 -0.06
NEWS                 -0.06
unread               -0.06
ق                    -0.06
receives             -0.06
Positive Logits
Token                Logit
Hamilton             0.16
Hamilton             0.15
ーラ                 0.08
飞                   0.07
.flip                0.07
FlowLayout           0.07
handleSubmit         0.07
.viewModel           0.07
(_)                  0.07
hait                 0.07
Activations Density: 0.001%