INDEX
Explanations
This neuron detects parenthetical review citations (phrases like “(reviewed in …)”).
Negative Logits
Token      Logit
잖         -0.07
Marino     -0.07
posX       -0.07
Chow       -0.07
ldkf       -0.07
месяца     -0.06
603        -0.06
відмов     -0.06
BJECT      -0.06
Ronaldo    -0.06
Positive Logits
Token      Logit
而         0.07
byter      0.07
ayr        0.07
(line      0.06
/><        0.06
cash       0.06
WELL       0.06
Народ      0.06
Fact       0.06
high       0.06
Activations Density 0.007%