INDEX
Explanations
This neuron specifically activates on the token “York,” most notably when it appears as part of “New York.”
Negative Logits
Token         Logit
tostring      -0.07
dda           -0.06
духов         -0.06
Init          -0.06
Camera        -0.06
(f            -0.06
Girlfriend    -0.06
Afterwards    -0.06
fetus         -0.06
IER           -0.06
Positive Logits
Token         Logit
York          0.17
YORK          0.10
York          0.10
Yorker        0.10
york          0.09
Yorkers       0.08
neurop        0.08
NY            0.08
are           0.08
Oregon        0.07
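One common way to obtain token lists like these is to project a neuron's output direction through the model's unembedding matrix and sort the resulting per-token scores. The page does not say how its values were computed, so the following is only a minimal sketch of that idea. The names `w_out`, `W_U`, and `top_logit_effects`, as well as the toy vocabulary and numbers, are hypothetical and do not come from this page.

```python
import numpy as np

def top_logit_effects(w_out, W_U, vocab, k=3):
    """Return the k most negative and k most positive (token, effect) pairs.

    w_out: (d_model,) hypothetical output direction of the neuron.
    W_U:   (d_model, vocab_size) hypothetical unembedding matrix.
    vocab: list of token strings, one per column of W_U.
    """
    effects = w_out @ W_U          # one scalar per vocabulary token
    order = np.argsort(effects)    # indices sorted by ascending effect
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy data (made up for illustration, not the real model weights).
vocab = ["York", "NY", "Camera", "Init"]
W_U = np.array([[1.0, 0.5, -0.25, -0.5],
                [0.0, 0.0, 0.0, 0.0]])
w_out = np.array([0.2, 0.0])

negative, positive = top_logit_effects(w_out, W_U, vocab, k=2)
print("negative:", negative)
print("positive:", positive)
```

In this toy setup the tokens with the largest positive scores are the ones the neuron pushes the model toward when it fires, and the most negative ones are those it pushes against.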
Activations Density: 0.012%
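Assuming the density figure is the fraction of sampled tokens on which the neuron's activation is nonzero (the page does not define it), 0.012% corresponds to roughly 12 tokens per 100,000. A small sketch of that calculation on synthetic data, where the function name `activation_density` and the threshold of zero are assumptions:

```python
import numpy as np

def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`."""
    activations = np.asarray(activations, dtype=float)
    return float(np.mean(activations > threshold))

# Synthetic example: 100,000 tokens, 12 of which fire.
acts = np.zeros(100_000)
acts[:12] = 1.0

density = activation_density(acts)
print(f"{density:.3%}")
```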