Explanations
This neuron responds selectively to content-bearing words (nouns, adjectives, and other semantically heavy terms) while ignoring common grammatical function words.
Negative Logits
cc           -0.07
�            -0.06
475          -0.06
สถาน         -0.06
Priority     -0.06
Spencer      -0.06
469          -0.05
组织         -0.05
meye         -0.05
EventData    -0.05
Positive Logits
delightful     0.07
رير            0.07
redistribute   0.07
.mobile        0.06
Want           0.06
breadcrumb     0.06
REATE          0.06
mobile         0.06
važ            0.06
.depend        0.06
Activations Density 0.113%
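An activation density figure is usually read as the fraction of sampled token positions on which the neuron fires. The sketch below assumes that definition and a firing threshold of zero; neither is stated on this page. The function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumes "density" means the share of positions on which the neuron fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 1000 token positions, exactly one of which fires.
sample = [0.0] * 999 + [2.5]
print(f"{activation_density(sample):.3%}")  # prints 0.100%
```

Under this reading, a density of 0.113% would mean the neuron is active on roughly one token in 900, which would make it quite sparse.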