Explanations
This neuron detects in‐text citation markers (in particular the “pone” tokens within reference tags).
Negative Logits
LF          -0.07
.monitor    -0.07
.Values     -0.07
範          -0.06
iêm         -0.06
_CI         -0.06
_rs         -0.06
shows       -0.06
(ar         -0.06
قف          -0.06
Positive Logits
Rename      0.07
stata       0.06
-filled     0.06
ongoing     0.06
valleys     0.06
lives       0.06
والتي       0.06
writeTo     0.06
sme         0.06
hashes      0.06
Activations Density: 0.002%