Explanations
The neuron fires on semantically rich, content-bearing tokens (especially verbs and nouns) rather than on common function words.
Negative Logits
्यत          -0.08
names        -0.07
-only        -0.07
only         -0.07
submitted    -0.06
гаран        -0.06
nonnull      -0.06
surveys      -0.06
harvested    -0.06
(messages    -0.06
Positive Logits
026          0.06
่านมา        0.06
ียว          0.06
household    0.06
tvoří        0.06
quoise       0.06
coma         0.06
이다         0.06
nhu          0.06
ulumi        0.06
Activations Density: 0.256%