INDEX
Explanations
negation
The neuron broadly responds to medium-frequency content words (i.e. non-stopwords with real lexical meaning), regardless of part of speech.
Negative Logits
Token        Logit
MOOTH        -0.07
omm          -0.07
problematic  -0.07
Zy           -0.06
destino      -0.06
nap          -0.06
vul          -0.06
questo       -0.06
descri       -0.06
'aff         -0.06
Positive Logits
Token        Logit
dağ          0.07
_tcp         0.06
带           0.06
responder    0.06
kuruluş      0.06
+b           0.06
्षमत         0.06
đau          0.06
ляться       0.06
ोज           0.06
Activations Density: 0.041%