Explanations
The neuron consistently activates on the spelled-out number “one.”
Negative Logits
//                 -0.08
peanuts            -0.07
osals              -0.07
-solving           -0.06
風                 -0.06
rup                -0.06
histoire           -0.06
farm               -0.06
_screen            -0.06
sts                -0.06
Positive Logits
one                0.13
ONE                0.08
One                0.08
Organizer          0.07
.SerializeObject   0.06
overwritten        0.06
ones               0.06
單                 0.06
each               0.06
jednoho            0.06
Activations Density 0.042%
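
As a rough illustration of how a density figure like this could be computed, here is a minimal sketch. It assumes that density means the fraction of token positions on which the neuron's activation is above zero; the source does not define the term, so that definition, the function name activation_density, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the neuron fires.
    The source page does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 4 of which have nonzero activation.
sample = [0.0] * 9996 + [1.2, 0.4, 3.1, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.042% would mean the neuron fires on roughly 4 out of every 10,000 tokens, which fits a neuron tied to one specific word.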