INDEX
Explanations
This neuron reliably fires on the first word of a new article or section (for example an opening “The,” “Museum,” or “Population”), marking the start of a topic segment.
Negative Logits
ega       -0.07
olvable   -0.06
िलत       -0.06
ати       -0.06
cka       -0.06
illon     -0.06
Cit       -0.06
setback   -0.06
िथ        -0.06
_ct       -0.06
Positive Logits
Dire        0.07
conjug      0.07
rightfully  0.07
…”          0.06
/hash       0.06
003         0.06
.rs         0.06
boring      0.06
recv        0.06
topics      0.06
Activations Density 0.083%
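
As a rough illustration of what a density figure like this could mean, the sketch below computes the fraction of token positions on which a neuron is active. It assumes that "active" means an activation strictly above a threshold of zero; the page does not state the definition it uses. The function name `activation_density` and the synthetic activation list are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold`. The dashboard's own definition may differ.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example data (not real measurements): 10 active positions
# out of 12,000 tokens.
acts = [0.0] * 11_990 + [1.5] * 10
density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.083%"
```

A density this low is consistent with a neuron that fires only at the start of an article or section, since such positions are a small share of all tokens.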