    Explanations

    The neuron activates on tokens that name a position on a vehicle body, most often “rear” and occasionally “front”.

    Negative Logits
    Logit   Token
    -0.07   "Leaks"
    -0.07   "World"
    -0.06   "NUM"
    -0.06   " الحديث"
    -0.06   " repeatedly"
    -0.06   ",the"
    -0.06   "ился"
    -0.06   "وت"
    -0.06   "X"
    -0.06   " Idol"
    Positive Logits
    Logit   Token
    0.07    " front"
    0.07    " prat"
    0.06    "PosY"
    0.06    " Scientology"
    0.06    " vicinity"
    0.06    ".Region"
    0.06    " classname"
    0.06    " Auth"
    0.06    "icago"
    0.06
    Activation Density: 0.005%

    No Known Activations