INDEX
Explanations
comparing two different things
The neuron detects the special placeholder tokens used to label characters (e.g. “NAME_2,” “NAME_3,” etc.).
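As a rough illustration of the placeholder tokens this explanation refers to, the sketch below finds them in a piece of text. It assumes the placeholders always take the form `NAME_` followed by digits, as in the two examples given. The names `PLACEHOLDER` and `find_placeholders` are hypothetical and are not part of any tool referenced on this page.

```python
import re

# Assumed placeholder shape: the literal "NAME_" followed by one or more digits.
# This is inferred from the examples "NAME_2" and "NAME_3" only.
PLACEHOLDER = re.compile(r"\bNAME_\d+\b")


def find_placeholders(text: str) -> list[str]:
    """Return every NAME_<n> placeholder in text, in order of appearance."""
    return PLACEHOLDER.findall(text)


sample = "NAME_2 handed the letter to NAME_3 before NAME_2 left."
print(find_placeholders(sample))
```

A match here only says that the text contains such a placeholder; it does not say how strongly the neuron would activate on it.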
Negative Logits
Token         Logit
mi            -0.08
desarroll     -0.07
.Return       -0.07
.Large        -0.07
-sided        -0.07
Manager       -0.07
polynomial    -0.06
YS            -0.06
раск          -0.06
PRIV          -0.06
Positive Logits
Token         Logit
iena          0.06
граду         0.06
,filename     0.06
(display      0.06
hardship      0.06
(*)(          0.06
esModule      0.06
.INTEGER      0.06
числе         0.06
(&_           0.05
Activations Density: 0.014%