INDEX
    Explanations

    The neuron fires on words that denote exerting control or giving commands.

    Negative Logits
    " bet"                      -0.07
    "(news"                     -0.07
    (token missing in source)   -0.06
    " noting"                   -0.06
    " Winners"                  -0.06
    " tặng"                     -0.06
    "BODY"                      -0.06
    (token missing in source)   -0.06
    "ه"                         -0.06
    " lunch"                    -0.06
    Positive Logits
    " overlapping"              0.07
    "िसक"                       0.06
    "ござ"                      0.06
    " státu"                    0.06
    (token missing in source)   0.06
    " Compilation"              0.06
    ".setStyleSheet"            0.06
    " hierarchy"                0.06
    " findById"                 0.06
    "ุรก"                       0.06
    Activation Density: 0.012%

    No Known Activations