Explanations
Brackets
This neuron activates on numeric tokens, especially digits and decimal numbers that represent coefficients or matrix entries.
Negative Logits
Token       Logit
_START      -0.07
sauce       -0.07
buddies     -0.07
.floor      -0.07
IVERS       -0.07
EX          -0.07
ilerine     -0.07
ط           -0.07
ájem        -0.07
'})↵        -0.07
Positive Logits
Token       Logit
。</         0.06
adorned     0.06
Founded     0.06
проф        0.06
_sched      0.06
.'          0.06
.IntPtr     0.06
overhead    0.06
Paramount   0.05
_]          0.05
Activations Density 0.005%