    Explanations

    The neuron fires on the word “cancel” and its morphological variants, such as “cancelation” and “cancellations.”

    Negative Logits
    Token              Logit
    "(obs"             -0.07
    "ROUT"             -0.07
    ",image"           -0.06
    " Xunit"           -0.06
    "Soft"             -0.06
    [token missing]    -0.06
    " Wei"             -0.06
    " Obt"             -0.06
    [whitespace]       -0.06
    ".conditions"      -0.06
    Positive Logits
    Token              Logit
    " cancel"          0.13
    " Cancel"          0.12
    "Cancel"           0.11
    "_cancel"          0.11
    " cancell"         0.11
    "cancel"           0.10
    " cancelled"       0.10
    " canceled"        0.10
    "(cancel"          0.10
    "Canceled"         0.09
    Activation Density: 0.006%

    No Known Activations
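
The explanation above can be checked against the positive logits with a short script. This is a minimal sketch, not part of the original dashboard: the token/value pairs are copied from the positive-logit list, while the helper names `normalize` and `shares_stem` and the stem-matching rule (strip whitespace and a leading `_` or `(`, then lowercase) are assumptions made for illustration.

```python
# Positive-logit tokens and values copied from the list above.
# Leading spaces are part of the token and are kept on purpose.
positive_logits = [
    (" cancel", 0.13),
    (" Cancel", 0.12),
    ("Cancel", 0.11),
    ("_cancel", 0.11),
    (" cancell", 0.11),
    ("cancel", 0.10),
    (" cancelled", 0.10),
    (" canceled", 0.10),
    ("(cancel", 0.10),
    ("Canceled", 0.09),
]


def normalize(token: str) -> str:
    """Hypothetical helper: drop surrounding whitespace and a leading '_' or '(', then lowercase."""
    return token.strip().lstrip("_(").lower()


def shares_stem(token: str, stem: str = "cancel") -> bool:
    """Return True if the normalized token begins with the given stem."""
    return normalize(token).startswith(stem)


# Collect the tokens that reduce to the "cancel" stem and report coverage.
matches = [tok for tok, _ in positive_logits if shares_stem(tok)]
coverage = len(matches) / len(positive_logits)
print(f"{len(matches)}/{len(positive_logits)} tokens share the stem 'cancel'")
```

Under this matching rule, every listed positive-logit token reduces to the stem "cancel", which is consistent with the explanation that the neuron tracks "cancel" and its variants. The same check applied to the negative-logit tokens (for example "Soft" or ".conditions") would find no matches.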