    Explanations

    The neuron fires on adverbs indicating a negative statistical correlation (e.g. “negatively” in “negatively correlated”).

    Negative Logits
    aine        -0.08
    Neighbor    -0.07
     사항        -0.07
     Owens      -0.07
    para        -0.07
     sammen     -0.07
     Також      -0.07
    ooke        -0.07
    函数        -0.07
    Ban         -0.07
    Positive Logits
     removeAll  0.06
     dial       0.06
    _nav        0.06
    _RANDOM     0.06
     inclusive  0.06
    查询        0.06
    -standing   0.06
     getC       0.06
     وظ         0.06
     Marlins    0.06
    Activation Density: 0.005%

    No Known Activations