Explanations
punctuation
This neuron fires on news-style dateline and metadata tokens (e.g., city names followed by colons, parenthetical source tags, and date/time markers).
Negative Logits
tx    -0.07
ような    -0.06
엔    -0.06
mocked    -0.06
ống    -0.06
iliz    -0.06
trad    -0.06
Typed    -0.06
ويل    -0.06
(active    -0.06
Positive Logits
görev    0.07
Gre    0.07
DPR    0.06
面积    0.06
_SEGMENT    0.06
-*    0.06
wrath    0.06
sk    0.06
::|    0.06
consultancy    0.06
Activations Density 0.206%