Explanations
journal publications
The neuron activates on tokens typical of scholarly citations: phrases that introduce or reference scientific studies (e.g. “According to a study published in the Journal of…”).
Negative Logits
-Bar    -0.07
rin     -0.07
cef     -0.07
ø       -0.07
ufig    -0.07
clone   -0.06
apo     -0.06
Hunt    -0.06
àng     -0.06
enko    -0.06
Positive Logits
favorite    0.07
�           0.07
ネ          0.06
Optionally  0.06
최신        0.06
zih         0.06
Million     0.06
acompan     0.06
clases      0.06
MVC         0.06
Activations Density 0.010%