Explanations
advertisements
The neuron fires on the first significant content word at the start of a new sentence (i.e. it marks sentence beginnings).
Negative Logits
摇    -0.07
п    -0.07
測    -0.06
pled    -0.06
増    -0.06
rowser    -0.06
-outline    -0.06
�    -0.06
gulp    -0.06
іту    -0.06
Positive Logits
Careers    0.07
acea    0.07
Stuart    0.06
RELATED    0.06
Pers    0.06
lovely    0.06
legis    0.06
Dr    0.06
LIS    0.06
)';↵    0.06
Activations Density 0.017%
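
A density figure like the one above is usually read as the share of token positions on which the neuron is active. The sketch below shows that arithmetic under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a nonzero
    (above-threshold) activation, expressed as a percentage.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 17 active positions out of 100,000 tokens.
sample = [1.0] * 17 + [0.0] * 99_983
density = activation_density(sample)
print(f"{density:.3f}%")
```

With a density this low, the neuron is silent on almost all tokens, so any top-activating examples are drawn from a very small slice of the text.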