Explanations
The neuron fires on the “Q:” and “A:” markers that delimit questions and answers in the post.
Negative Logits
Token        Logit
Wikispecies  -0.07
_MAGIC       -0.07
lixir        -0.07
lent         -0.07
Mant         -0.06
bo           -0.06
Tab          -0.06
prot         -0.06
rente        -0.06
lem          -0.06
Positive Logits
Token     Logit
自动      0.07
Years     0.07
CLUB      0.06
üy        0.06
RowBox    0.06
tickets   0.06
Ricky     0.06
_STYLE    0.06
CSV       0.06
Modified  0.06
Activations Density: 0.002%