INDEX
    Explanations

    The neuron fires on tokens related to “depend,” i.e. words or morphemes expressing dependence (e.g. depend, dependent, dependency).

    Negative Logits
    Token                 Logit
    " Carlton"            -0.07
    "uzz"                 -0.07
    "RA"                  -0.07
    "Roll"                -0.07
    "Color"               -0.07
    " vua"                -0.07
    [token not shown]     -0.07
    " TX"                 -0.07
    " sco"                -0.06
    " rfl"                -0.06
    Positive Logits
    Token                 Logit
    " dependent"          0.13
    "-dependent"          0.10
    " Dep"                0.09
    "Dep"                 0.09
    "pend"                0.08
    "PEND"                0.08
    ".dep"                0.08
    " depend"             0.08
    " dependence"         0.08
    " Depend"             0.08
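
As a rough sanity check, the explanation can be compared against the listed tokens. The sketch below assumes a hypothetical rule, not part of the original page: a token counts as "depend"-related if it contains `dep` or `pend`, ignoring case. The token lists are copied from the logit lists on this page; the negative-logit entry whose token is not shown is left out.

```python
import re

# Tokens copied from the logit lists on this page (leading spaces preserved).
positive_tokens = [
    " dependent", "-dependent", " Dep", "Dep", "pend",
    "PEND", ".dep", " depend", " dependence", " Depend",
]
negative_tokens = [
    " Carlton", "uzz", "RA", "Roll", "Color",
    " vua", " TX", " sco", " rfl",
]

# Hypothetical matching rule (an assumption for illustration only):
# a token is "depend"-related if it contains "dep" or "pend", case-insensitively.
DEPEND_PATTERN = re.compile(r"dep|pend", re.IGNORECASE)


def matches_explanation(token: str) -> bool:
    """Return True if the token looks like a piece of 'depend'."""
    return DEPEND_PATTERN.search(token) is not None


# Positive-logit tokens that the rule fails to cover.
unmatched_positive = [t for t in positive_tokens if not matches_explanation(t)]
# Negative-logit tokens that the rule unexpectedly covers.
matched_negative = [t for t in negative_tokens if matches_explanation(t)]

print(unmatched_positive)
print(matched_negative)
```

Under this rule, both printed lists being empty would mean that every positive-logit token fits the explanation and no negative-logit token does. A substring rule is crude, so it only checks surface form, not meaning.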
    Act Density 0.019%
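
To put the density figure in concrete terms, here is a minimal sketch. It assumes that activation density means the fraction of sampled tokens on which the neuron has a nonzero activation; the page does not state its definition. The function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 19 active tokens out of 100,000, chosen to match 0.019%.
sample_activations = [0.0] * 99_981 + [1.5] * 19
density = activation_density(sample_activations)
formatted = f"{density:.3%}"
print(formatted)
```

Under that assumption, a density of 0.019% corresponds to roughly 19 active tokens per 100,000 tokens sampled.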

    No Known Activations