Explanations
This neuron primarily detects arithmetic expressions or instructions: operation words such as “divide,” “calculate,” “times,” and “minus” appearing together with numbers.
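As a rough illustration of the pattern described in the explanation, here is a minimal text heuristic. It is only a sketch: the keyword list is limited to the four words quoted in the explanation, the function name `looks_like_arithmetic` is hypothetical, and nothing here reflects how the neuron itself computes its activation.

```python
import re

# Hypothetical keyword list: these four words are the ones quoted in the
# explanation; the neuron is not assumed to be limited to them.
ARITHMETIC_WORDS = {"divide", "calculate", "times", "minus"}


def looks_like_arithmetic(text: str) -> bool:
    """Return True if the text has an arithmetic word and at least one digit."""
    # Lowercase the text and collect its alphabetic words.
    words = set(re.findall(r"[a-z]+", text.lower()))
    has_keyword = bool(words & ARITHMETIC_WORDS)
    # Any digit anywhere in the text counts as "a number" for this sketch.
    has_number = re.search(r"\d", text) is not None
    return has_keyword and has_number
```

A heuristic like this will over-trigger on phrases such as "3 times a day", so it is useful only as a quick sanity check on the explanation, not as a substitute for inspecting the neuron's top activations.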
Negative Logits
urges             -0.08
easiest           -0.07
budgets           -0.06
IRS               -0.06
synchronization   -0.06
.crop             -0.06
.result           -0.06
.CLASS            -0.06
learn             -0.06
33                -0.06
Positive Logits
\Events           0.07
_inp              0.07
orgot             0.07
’da               0.07
quốc              0.07
Civil             0.07
่ก                0.07
→↵↵               0.07
Correo            0.07
чення             0.07
Activations Density: 0.021%