INDEX
Explanations
multilingual curiosities
list-style structure and outline formatting, especially numbered “top 10” items and section headings.
The neuron detects tokens that represent numeric quantities, especially list-size numbers and other multi-digit numerals.
Negative Logits
TCM: 0.11
ṣe: 0.11
או: 0.11
および: 0.11
profiles: 0.11
postup: 0.10
5: 0.10
MT: 0.10
profiles: 0.10
ও: 0.10
Positive Logits
现象: 0.11
infallible: 0.11
言っ: 0.11
ignorant: 0.10
manures: 0.10
♂️: 0.10
वाक्य: 0.10
voisinage: 0.10
esist: 0.10
ยนต์: 0.10
Activations Density: 18.376%