INDEX
Explanations
News articles
The neuron activates on proper names—distinct capitalized entities like people’s names, place names, or organization abbreviations.
Negative Logits
Token      Logit
UND        -0.07
VENTORY    -0.07
manipulate -0.07
jury       -0.07
ING        -0.06
')↵↵       -0.06
↵          -0.06
-average   -0.06
ходить     -0.06
↵ ↵        -0.06
Positive Logits
Token          Logit
<dim           0.07
unny           0.07
arbe           0.06
zek            0.06
indispens      0.06
cuk            0.06
주소           0.06
stroy          0.06
.setForeground 0.06
boys           0.06
Activations Density 0.045%
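
To make the density figure concrete, here is a minimal sketch that assumes activation density means the fraction of sampled token positions on which the neuron fires (activation above zero). The function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of sampled tokens on which the
    neuron is active; the dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 9 active positions out of 20,000 tokens.
sample = [0.0] * 19_991 + [1.3] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.045% corresponds to roughly 4.5 active tokens per 10,000 sampled, so the neuron is silent on the vast majority of text.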