    Explanations

    The neuron detects tokens that form Google developer documentation URLs (e.g. “https://developers.google.com/...”).

    Negative Logits
    ruits
    -0.07
    captures
    -0.07
    (World
    -0.07
     Accred
    -0.07
     antes
    -0.06
    /close
    -0.06
    .maven
    -0.06
    autos
    -0.06
     اجرا
    -0.06
    sprites
    -0.06
    Positive Logits
     Obama
    0.06
     Dense
    0.06
    0.06
    ‬↵
    0.06
    ‌ان
    0.06
     cháy
    0.06
     weighted
    0.06
    ้ม
    0.06
    aneous
    0.06
    UMENT
    0.06
    Activation Density: 0.004%

    No Known Activations