INDEX
Explanations
opinions, questions
The neuron fires on informal, vague quantifier phrases that begin general statements (e.g. “a lot of what’s…”).
NEGATIVE LOGITS
ницы      -0.07
Gardner   -0.07
cott      -0.07
abl       -0.06
chooser   -0.06
Variant   -0.06
tomato    -0.06
_sc       -0.06
meld      -0.06
ْر        -0.06
POSITIVE LOGITS
ted               0.06
Фед               0.06
\App              0.06
аб                0.06
ishop             0.06
isk               0.06
_DIFF             0.06
매우              0.06
ReuseIdentifier   0.06
(err              0.06
Activations Density 0.062%
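
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed, assuming that density means the fraction of sampled tokens on which the neuron's activation is above zero. The page does not state this definition, so treat it as an assumption. The function name `activation_density` and the sample counts are hypothetical, chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    neuron is active at all. The dashboard does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 62 have a nonzero activation.
sample = [0.0] * 100_000
for i in range(62):
    sample[i * 1000] = 1.0  # illustrative values, not real activations

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.062% would mean the neuron is active on roughly 6 of every 10,000 tokens, which fits a neuron that responds to a narrow class of phrases.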