INDEX
    Explanations

    The neuron activates on tokens that describe machine‐learning model training or usage actions.

    Negative Logits
    '-service'      -0.07
    '-long'         -0.06
    'Bit'           -0.06
    ' erection'     -0.06
    'esktop'        -0.06
    '302'           -0.06
    '277'           -0.06
    'copyright'     -0.06
    '_layout'       -0.06
    '.train'        -0.06
    Positive Logits
    ' ورز'          0.07
    ' haline'       0.07
    ' entfer'       0.06
    '名無し'        0.06
    '_MINOR'        0.06
    'íveis'         0.06
    ' Lean'         0.06
    'fulWidget'     0.06
    ' why'          0.06
    '+"</'          0.06
    Activation Density: 0.046%

    No Known Activations