INDEX
Explanations
The neuron activates on the token “other,” effectively spotting occurrences of that word.
Negative Logits
adal          -0.07
-ups          -0.07
-up           -0.07
приєм         -0.07
the           -0.06
NOP           -0.06
assertSame    -0.06
himself       -0.06
schemas       -0.06
温            -0.06
Positive Logits
itunes        0.07
.food         0.06
aliqua        0.06
(accounts     0.06
quý           0.06
корот         0.06
ITEM          0.06
اصلی          0.06
页面          0.06
OPT           0.06
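The page does not say how the logit lists above are produced. A minimal sketch of one common approach, assumed here rather than taken from the source: score each vocabulary token by the dot product of the neuron's output direction with that token's unembedding vector, then report the lowest and highest scores. The names `top_logit_effects`, `direction`, `unembed`, and `vocab`, and all toy values, are hypothetical.

```python
def top_logit_effects(direction, unembed, vocab, k=10):
    """Return (most negative, most positive) token/score pairs.

    direction: the neuron's output direction (list of floats); hypothetical input.
    unembed:   one row per vocabulary token, same length as direction.
    vocab:     token strings, aligned with the rows of unembed.
    """
    # Dot product of the direction with each token's unembedding row.
    scores = [sum(d * w for d, w in zip(direction, row)) for row in unembed]
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1])
    negative = ranked[:k]        # lowest scores first
    positive = ranked[::-1][:k]  # highest scores first
    return negative, positive


# Toy data for illustration only, not values from the page.
toy_vocab = ["a", "b", "c"]
toy_unembed = [[0.5, 0.0], [-0.5, 0.0], [0.1, 0.0]]
neg, pos = top_logit_effects([1.0, 0.0], toy_unembed, toy_vocab, k=1)
print(neg, pos)
```

Under this assumption, scores near zero such as the ±0.06 to ±0.07 values listed would mean the direction barely changes any token's logit.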
Activations Density 0.091%
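Activation density is usually read as the fraction of sampled tokens on which the neuron fires. A minimal sketch under that assumption follows; the function name, the zero threshold, and the sample data are all hypothetical and do not come from the page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1 active token out of 10,000.
sample = [0.0] * 9999 + [1.5]
print(f"{activation_density(sample):.3%}")
```

On this reading, a density of 0.091% corresponds to roughly 9 active tokens per 10,000 sampled.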