Explanations
This neuron fires on isolated single-character tokens, most often uppercase Latin letters or single Katakana characters.
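The pattern described above can be approximated with a small token check. This is a minimal sketch, not the tool's own method: the function name `is_single_char_target` is hypothetical, and it assumes that "isolated" means the token is exactly one character after surrounding whitespace is stripped.

```python
import unicodedata


def is_single_char_target(token: str) -> bool:
    """Return True if the token is one uppercase Latin letter or one Katakana letter.

    Assumption: surrounding whitespace is ignored, since tokenizers often
    attach a leading space to a token.
    """
    text = token.strip()
    if len(text) != 1:
        return False
    # Unicode character names identify the script and case of the character.
    name = unicodedata.name(text, "")
    return name.startswith("LATIN CAPITAL LETTER") or name.startswith("KATAKANA LETTER")


if __name__ == "__main__":
    for tok in ["A", " カ", "a", "Pyramid", "DBG"]:
        print(repr(tok), is_single_char_target(tok))
```

A check like this only mirrors the written explanation; it says nothing about how strongly the neuron actually activates on any given token.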
Negative Logits
زان          -0.07
Terminator   -0.07
榜           -0.07
ramid        -0.06
사는         -0.06
Pyramid      -0.06
Tourism      -0.06
antine       -0.06
Gün          -0.06
magazines    -0.06
Positive Logits
.raises      0.07
DBG          0.07
grad         0.06
Gui          0.06
�            0.06
/')↵         0.06
Dao          0.06
<bool        0.06
(results     0.06
外部リンク   0.06
Activations Density 0.249%
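As a rough illustration of what a density figure like this could mean, the sketch below computes the fraction of tokens on which a neuron is active. The definition used here (activation strictly above a threshold of 0) and the function name `activation_density` are assumptions; the source does not state how the 0.249% value was calculated.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds the threshold.

    Assumption: "density" means the share of active tokens; the page
    itself does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Synthetic data: 249 active tokens out of 100,000.
    acts = [1.0] * 249 + [0.0] * 99_751
    print(f"{activation_density(acts):.3%}")
```

Under this assumed definition, a density of 0.249% would correspond to roughly one active token in every 400.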