INDEX
Explanations
The neuron fires on terms that describe modifying a structure, especially adding or removing elements (e.g. “adding,” “removing,” “columns,” “properties”).
Negative Logits
}, ↵              -0.08
OTHERWISE         -0.07
بعد               -0.07
FIX               -0.07
.Lock             -0.07
},↵↵↵             -0.07
hexatrigesimal    -0.07
acids             -0.06
��                -0.06
Titan             -0.06
Positive Logits
activ             0.08
lone              0.06
Mystery           0.06
etine             0.06
::$               0.06
ूज                0.06
elop              0.06
Jur               0.06
-status           0.06
fant              0.06
Activations Density 0.010%