    Explanations

    The neuron primarily fires on the word “specific,” especially when it appears in the phrase “specific behavior.”

    Negative Logits
    " studi"       -0.07
    "aniu"         -0.07
    " فن"          -0.06
    " mỗi"         -0.06
    " hott"        -0.06
    "请输入"        -0.06
    "ünüz"         -0.06
    "งเศ"          -0.06
    " conquest"    -0.06
    (whitespace)   -0.06

    Positive Logits
    "DH"           0.07
    " відбу"       0.07
    "DNS"          0.06
    " addChild"    0.06
    "adoop"        0.06
    ".Move"        0.06
    "Portály"      0.06
    "agoon"        0.06
    ".makedirs"    0.06
    "abbrev"       0.06

    Activation Density: 0.006%

    No Known Activations