INDEX
    Explanations

    The neuron activates on normative requirement language: phrases stating what “must” or “should” be done, or what needs to be present “at least”.

    Negative Logits
    " Best"        -0.07
    " employees"   -0.07
    " आई"          -0.07
    "User"         -0.07
    " внимание"    -0.06
    "-plane"       -0.06
    " owning"      -0.06
    "Hide"         -0.06
    " Employees"   -0.06
    "Pi"           -0.06
    Positive Logits
    "чи"             0.07
    "ING"            0.07
    " disparate"     0.07
    "инг"            0.07
    " extravag"      0.07
    " صند"           0.06
    "rico"           0.06
    " cer"           0.06
    " territorial"   0.06
    0.06
    Activation Density: 0.016%

    No Known Activations