Explanations
This neuron selectively activates on the preposition “on” in scientific‐style titles of the form “Effect of X on Y.”
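As a rough illustration of the title pattern described above, the sketch below locates the "on" in strings shaped like "Effect of X on Y". The regular expression and the helper name `find_on_span` are assumptions made for illustration only. They approximate the textual pattern and say nothing about how the neuron itself computes its activation.

```python
import re

# Hypothetical proxy for the described pattern: "Effect(s) of X on Y".
# X is matched non-greedily so the first standalone "on" after "Effect of" is chosen.
TITLE_PATTERN = re.compile(
    r"\bEffects?\s+of\s+(?P<x>.+?)\s+(?P<on>on)\s+(?P<y>.+)",
    re.IGNORECASE,
)


def find_on_span(title):
    """Return the (start, end) character span of "on" in an
    "Effect of X on Y" style title, or None if the title does not match."""
    match = TITLE_PATTERN.search(title)
    if match is None:
        return None
    return match.span("on")


if __name__ == "__main__":
    example = "Effect of temperature on enzyme activity"
    span = find_on_span(example)
    if span is not None:
        print(example[span[0]:span[1]])
```

A title that lacks the "Effect of ..." frame, such as "On the origin of species", returns `None` under this sketch.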
Negative Logits
Token          Logit
保護           -0.08
harvested      -0.06
.enterprise    -0.06
abling         -0.06
отношения      -0.06
玩             -0.06
.addTo         -0.06
surprised      -0.06
finds          -0.06
полі           -0.06
Positive Logits
Token          Logit
pgsql          0.07
[...]↵↵        0.07
ственная       0.06
GX             0.06
ncy            0.06
egy            0.06
φ              0.06
tempt          0.06
�              0.06
ward           0.06
Activations Density 0.027%