    Explanations

    This neuron's activation rises steadily with a token's distance from the start of the text, so it effectively acts as a positional counter that tracks how far into the document the current token is.
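
    One way to check an explanation like this is to correlate activation with token position. The sketch below is illustrative only: `position_correlation` is a hypothetical helper, and the activation values are made up, not recorded from this neuron.

```python
from statistics import mean


def position_correlation(activations):
    """Pearson correlation between token position (0, 1, 2, ...) and activation."""
    n = len(activations)
    if n < 2:
        raise ValueError("need at least two tokens")
    positions = range(n)
    mean_pos = mean(positions)
    mean_act = mean(activations)
    cov = sum((p - mean_pos) * (a - mean_act) for p, a in zip(positions, activations))
    var_pos = sum((p - mean_pos) ** 2 for p in positions)
    var_act = sum((a - mean_act) ** 2 for a in activations)
    if var_act == 0:
        return 0.0  # constant activation carries no positional signal
    return cov / (var_pos * var_act) ** 0.5


# Made-up activations, for illustration only.
counter_like = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]  # grows with position
constant = [1.0, 1.0, 1.0, 1.0]                # ignores position
```

    A correlation near 1 on real per-token activations would be consistent with the positional-counter reading; a value near 0 would argue against it.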

    Negative Logits
    _job            -0.07
     benchmark      -0.07
    median          -0.07
    -log            -0.07
     Philippines    -0.06
     Policy         -0.06
    lav             -0.06
    zw              -0.06
    Policy          -0.06
    leton           -0.06
    Positive Logits
     іс             0.07
     педагог        0.07
     sexuales       0.07
    ']}}</          0.06
    (shader         0.06
                    0.06
    ‌ها              0.06
    .dst            0.06
    ()}</           0.06
    RGBO            0.06
    Act Density 0.046%
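
    The page does not define this figure. Assuming it is the share of tokens on which the neuron's activation is above zero, it could be computed as in the sketch below; `activation_density` and the `threshold` parameter are hypothetical names, and the sample data is made up.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("no activations given")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 46 active tokens out of 100,000.
sample = [0.0] * 99_954 + [1.0] * 46
density_percent = 100 * activation_density(sample)
```

    Under that assumption, a density of 0.046% would mean roughly 46 active tokens per 100,000.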

    No Known Activations