    Explanations

    This neuron mainly detects occurrences of the substring “access” in tokens.
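
    As a rough illustration of that explanation, the following is a minimal sketch of a substring check applied to the tokens in the logit lists on this page. The function name `contains_access` is hypothetical, and matching case-insensitively is an assumption suggested by listed tokens such as "ACCESS" and ".Access". It is not the neuron's actual computation, only a plain-string analogue of the described behavior.

```python
def contains_access(token: str) -> bool:
    """Return True if the token contains "access", ignoring case (an assumption)."""
    return "access" in token.lower()


# Tokens copied from the positive and negative logit lists on this page.
# A leading space inside the quotes is part of the token itself.
positive_tokens = [" access", " Access", "Access", "access", "ACCESS",
                   " accessing", "_access", " ACCESS", "-access", ".Access"]
negative_tokens = [" Juliet", " glm", " диамет", " disastr", " seventeen",
                   " hurricane", " Parade", "Ron", " Leonard", "27"]

# One boolean per token: does the substring check fire on it?
matches_positive = [contains_access(t) for t in positive_tokens]
matches_negative = [contains_access(t) for t in negative_tokens]
```

    Under this sketch, every token in the positive list matches the check and none of the tokens in the negative list do, which is consistent with the explanation above.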

    Negative Logits
    Token            Logit
    " Juliet"        -0.08
    " glm"           -0.07
    " диамет"        -0.07
    " disastr"       -0.07
    " seventeen"     -0.07
    " hurricane"     -0.07
    " Parade"        -0.07
    "Ron"            -0.07
    " Leonard"       -0.07
    "27"             -0.07
    Positive Logits
    Token            Logit
    " access"        0.16
    " Access"        0.15
    "Access"         0.12
    "access"         0.11
    "ACCESS"         0.09
    " accessing"     0.09
    "_access"        0.09
    " ACCESS"        0.09
    "-access"        0.09
    ".Access"        0.08
    Activation Density: 0.042%

    No Known Activations