Explanations
flags and symbols
The neuron primarily activates on numerical tokens (e.g. years, centuries, and other numeral-based references).
Negative Logits
Token        Logit
.Invariant   -0.07
itest        -0.06
Operator     -0.06
搬           -0.06
ecology      -0.06
esco         -0.06
Hide         -0.06
seize        -0.06
Execute      -0.06
았다         -0.06
Positive Logits
Token        Logit
'LBL         0.07
(resp        0.06
カテ         0.06
otel         0.06
потрап       0.06
Returns      0.06
ure          0.06
jLabel       0.06
投稿         0.06
.showToast   0.06
Activations Density: 0.019%
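
To put the activation density figure in concrete terms, here is a minimal sketch in Python. It assumes that density means the fraction of sampled tokens on which the neuron's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens with a nonzero (positive)
    activation. The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 19 active tokens out of 100,000 in total.
sample = [1.0] * 19 + [0.0] * (100_000 - 19)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.019% corresponds to roughly 19 active tokens per 100,000, which would make this a very sparsely firing neuron.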