Explanations
The neuron activates on the category lines at the ends of articles, especially on numeric year tokens and words like “stations,” “opened,” or “establishments”; that is, it detects category tags indicating the year something was established or opened.
Negative Logits
ainter      -0.08
isted       -0.06
azo         -0.06
amaç        -0.06
ěli         -0.06
거래        -0.06
_codigo     -0.06
…)          -0.06
_des        -0.06
conforme    -0.06
Positive Logits
도가        0.08
tyre        0.07
PTS         0.06
リス        0.06
NK          0.06
.locals     0.06
TJ          0.06
kamp        0.06
Cell        0.06
NSF         0.06
Activations Density 0.028%
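The density figure can be read as the share of sampled tokens on which the neuron fires. The following is a minimal sketch of that arithmetic, assuming density means the percentage of tokens whose activation is strictly above a threshold of zero. The function name `activation_density`, the threshold, and the toy data are hypothetical and are not taken from the dashboard.

```python
from typing import Iterable


def activation_density(activations: Iterable[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of firing tokens; the dashboard's
    exact definition is not stated in the source.
    """
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation value")
    firing = sum(1 for a in values if a > threshold)
    return 100.0 * firing / len(values)


# Toy data (invented for illustration): 100,000 tokens, 28 of which fire.
toy = [0.0] * 99_972 + [1.0] * 28
print(f"{activation_density(toy):.3f}%")  # prints "0.028%"
```

Under this reading, a density of 0.028% means the neuron fires on roughly 28 of every 100,000 tokens, which would fit a feature limited to category lines at the ends of articles.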