Explanations
This neuron activates on words signaling recent developments or ongoing innovation (e.g., “new,” “being,” “developed,” “introduced,” “evolving”).
Negative Logits
resolve     -0.06
夫          -0.06
Fot         -0.06
υπηρε       -0.06
attack      -0.06
(blank)     -0.06
Operand     -0.06
post        -0.06
Mash        -0.06
mime        -0.06
Positive Logits
908         0.07
_AL         0.07
_MO         0.07
.NEW        0.07
ZONE        0.06
ğü          0.06
PU          0.06
aroo        0.06
_MODEL      0.06
ổi          0.06
Activations Density: 0.035%