    Explanations

    The neuron activates on words that introduce a relative clause, especially the phrase “those who.”

    Negative Logits
     releases       -0.07
                    -0.07
    .RadioButton    -0.06
     epochs         -0.06
    Mapper          -0.06
     prints         -0.06
    Director        -0.06
                    -0.06
     touching       -0.06
     econom         -0.06

    Positive Logits
    -cigaret        0.06
    _RESULT         0.06
     searchData     0.06
    _wrong          0.06
     jj             0.06
    _CPP            0.06
     риз            0.06
     Convenience    0.06
    TEMPL           0.06
    _UNSUPPORTED    0.06

    Act Density 0.028%
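
    As a rough illustration of what a density figure like this could mean, the sketch below assumes that activation density is the fraction of sampled tokens on which the neuron's activation exceeds a threshold (zero here). That definition, the function name `activation_density`, and the sample data are assumptions for illustration only; they are not taken from this page or from any specific tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition, for illustration: density = active tokens / total tokens.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 28 active tokens out of 100,000.
acts = [1.5] * 28 + [0.0] * 99_972
density = activation_density(acts)
print(f"{density:.3%}")
```

    Under this assumed definition, a density of 0.028% corresponds to roughly 28 active tokens per 100,000 sampled.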

    No Known Activations