Explanations
unexpected behavior
This neuron fires on content-bearing tokens (nouns, verbs, adjectives, punctuation, code identifiers) and stays inactive on common function words such as "the", "and", and "to".
Negative Logits
Token       Logit
Ор          -0.06
aversal     -0.06
_with       -0.06
visual      -0.06
studi       -0.06
官网        -0.06
.retry      -0.06
Walt        -0.06
Schwe       -0.06
rewrite     -0.06
Positive Logits
Token        Logit
restricting  0.06
ALL          0.06
sprayed      0.06
MET          0.06
готов        0.06
neigh        0.06
continents   0.06
weekday      0.06
은           0.06
tow          0.06
Activation Density: 0.067%
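
As a rough illustration of what a density figure like this measures, the sketch below computes the share of tokens on which a neuron's activation is above zero. It assumes density means the fraction of tokens with a nonzero activation; the function name `activation_density`, the zero threshold, and the sample values are hypothetical and not taken from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the neuron is active.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for 10 tokens; only two are above zero.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.1%}")
```

Under that assumption, a density of 0.067% would correspond to roughly 67 active tokens per 100,000.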