INDEX
Explanations
The neuron fires on contractions and on modal or hedging words (e.g. “It’s,” “Sorry,” “must,” “I’m,” “don’t”).
Negative Logits
Token            Logit
astro            -0.06
(Tree            -0.06
顔               -0.06
unseren          -0.06
.AppendFormat    -0.06
urent            -0.06
prerequisites    -0.06
flash            -0.06
536              -0.06
Towers           -0.06
Positive Logits
Token            Logit
الجن             0.08
UNION            0.07
检               0.07
.'               0.06
ka               0.06
nok              0.06
_nome            0.06
succesfully      0.06
�                0.06
hiểm             0.06
Activations Density 0.148%
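
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed, assuming "activation density" means the fraction of token positions on which the neuron's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the toy data are all assumptions for illustration and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: density counts positions with activation > threshold;
    the page itself does not state how its figure is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 15 of them active.
toy = [0.0] * 9985 + [1.2] * 15
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.148% would mean the neuron is active on roughly 15 out of every 10,000 tokens sampled.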