Explanations
The neuron is triggered by floating‐point number tokens (decimal numerals).
Negative Logits
eworthy     -0.07
に関        -0.06
funct       -0.06
�           -0.06
suitable    -0.06
hsi         -0.06
advant      -0.06
weekday     -0.06
ronym       -0.06
Reduc       -0.06
Positive Logits
Around      0.08
over        0.08
tops        0.08
contour     0.07
↵           0.07
underwater  0.07
uit         0.07
everywhere  0.07
Down        0.07
contours    0.07
Activations Density 0.007%