    Explanations

    This neuron activates on occurrences of the substring “intrusive”, for example on its “usive” part.

    Negative Logits
    Token       Logit
    起来        -0.06
    733         -0.06
     monks      -0.06
    _closure    -0.06
    anc         -0.06
                -0.05
    そうな      -0.05
    cyan        -0.05
    орів        -0.05
     /////      -0.05

    Positive Logits
    Token       Logit
    tex         0.07
    ease        0.07
    oad         0.07
    uper        0.06
    イン        0.06
    _MAX        0.06
    );}↵↵       0.06
    σσ          0.06
    /ex         0.06
    маг         0.06
    Act Density 0.000%

    No Known Activations