INDEX
    Explanations

    This neuron does not detect any particular pattern in these snippets; it remains inactive across all tokens.

    Negative Logits
     finest        -0.07
     گفت           -0.06
     groot         -0.06
    REG            -0.06
    .edit          -0.06
    .sprites       -0.06
     tourist       -0.06
     cath          -0.06
    iences         -0.05
     القد          -0.05
    Positive Logits
    /manual        0.08
    ío             0.07
     نفسه          0.07
     yaz           0.06
    ()["           0.06
     autos         0.06
     nalez         0.06
     Exercises     0.06
    LOB            0.06
    ůvod           0.06
    Activation Density: 0.000%

    No Known Activations