Explanations
Labrador
This neuron activates on longer, domain-specific nouns, especially proper names or technical terms.
Negative Logits
lever        -0.06
weak         -0.06
Work         -0.06
"In          -0.06
Kid          -0.06
looks        -0.06
ico          -0.06
pancreatic   -0.06
ався         -0.06
output       -0.06
Positive Logits
nak          0.07
Му           0.07
aşam         0.07
pmat         0.07
bury         0.07
.CompareTo   0.06
nez          0.06
播           0.06
ЛО           0.06
husus        0.06
Activations Density 0.358%