Explanations
avoiding detail or repetition
The neuron activates on first-person discourse markers, typically hedging or transition phrases in which the writer signals that they are skipping detail or avoiding repetition (e.g. “I won’t go into…,” “I only want to add…”).
Negative Logits
Chrom
-0.07
export
-0.07
فراهم
-0.07
emails
-0.06
尼亚
-0.06
.bits
-0.06
.Match
-0.06
churches
-0.06
ив
-0.06
bra
-0.06
Positive Logits
_lvl
0.07
("/{
0.06
=tk
0.06
""),↵
0.06
dataType
0.06
/al
0.06
.sale
0.06
ी)
0.06
ogh
0.06
>>>(
0.06
Activations Density 0.013%