INDEX
    Explanations

    The neuron principally responds to mentions of “violation” (i.e. references to breaches of rules, contracts, or standards).

    New Auto-Interp
    Negative Logits
     competence
    -0.08
    模型
    -0.07
     العظ
    -0.06
     Alman
    -0.06
     hands
    -0.06
    (search
    -0.06
    James
    -0.06
     Consortium
    -0.06
     granny
    -0.06
     nắm
    -0.06
    POSITIVE LOGITS
     violate
    0.11
     viol
    0.11
     violation
    0.10
     violated
    0.10
     violations
    0.09
     violating
    0.09
     Viol
    0.09
    0.08
    wipe
    0.08
     violates
    0.08
    Act Density 0.011%

    No Known Activations