Explanations
This neuron fires on novelty claims: phrases asserting that something has never existed, been said, or been seen before (e.g. “never been said before,” “never seen before”).
Negative Logits
أف                    -0.07
马                    -0.07
sour                  -0.06
NotificationCenter    -0.06
jab                   -0.06
Nothing               -0.06
plugins               -0.06
_REQUIRED             -0.06
(buffer               -0.06
.gf                   -0.06
Positive Logits
annotated             0.07
�                     0.07
mond                  0.07
\<^                   0.06
carcin                0.06
espec                 0.06
할인                  0.06
misogyn               0.06
_AspNet               0.06
Registers             0.06
Activations Density 0.020%