INDEX
Explanations
This neuron fires on relatively rare or domain-specific content words (especially longer, technical or specialized terms), distinguishing them from common function words.
Negative Logits
ategor        -0.08
emplo         -0.07
iterative     -0.07
nj            -0.07
J             -0.07
مب            -0.07
गय            -0.07
reas          -0.07
.lifecycle    -0.06
consumption   -0.06
Positive Logits
Visit             0.07
oron              0.06
principalColumn   0.06
Solar             0.06
aspire            0.06
UTE               0.06
.requests         0.06
.mContext         0.06
TRACE             0.06
قرن               0.06
Activations Density: 0.333%
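
The density figure can be read as the share of token positions on which the neuron is active. The sketch below illustrates that reading only. It assumes "active" means an activation strictly above a threshold of zero, which the page does not state. The function name `activation_density` and the sample data are hypothetical and are not taken from the dashboard's own tooling.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 300 token positions, exactly one of them active.
sample = [0.0] * 299 + [1.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.333% corresponds to roughly one active position in every 300 tokens.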