INDEX
Explanations
The neuron detects sentence-initial temporal or discourse transition words (e.g. “Following,” “After,” “Based on”).
Negative Logits
flour      -0.07
Fur        -0.07
Fel        -0.07
defends    -0.06
borrow     -0.06
Müller     -0.06
endir      -0.06
defended   -0.06
chrom      -0.06
apur       -0.06
Positive Logits
.axis      0.07
/console   0.06
↵          0.06
(){↵       0.06
_checks    0.06
Predator   0.06
IEWS       0.05
Complete   0.05
стров      0.05
ราย        0.05
Activations Density 0.061%
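
For orientation, the density figure can be read as the share of token positions on which the neuron fires. The sketch below assumes that definition (activation strictly above a threshold of zero, reported as a percentage). The page does not say how the value was computed, so the function name `activation_density`, the threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds `threshold`.

    Assumed definition: the source page does not specify how density is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 10,000 token positions, 6 of which have a nonzero activation.
sample = [0.0] * 9994 + [0.8, 1.2, 0.3, 2.1, 0.5, 0.9]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.061% means the neuron is active on roughly 6 of every 10,000 tokens, which fits a feature restricted to a narrow class of sentence-initial words.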