    Explanations

    This neuron activates on instructions asking for additional examples (e.g. “add a few more examples to the list”).

    Negative Logits
     Mathematic   -0.07
    compat        -0.07
     jiných       -0.07
     цю           -0.07
     TPP          -0.06
     英语         -0.06
    please        -0.06
     timer        -0.06
     budeme       -0.06
    -0.06
    Positive Logits
     Stem         0.07
     nepří        0.07
    ations        0.06
     oracle       0.06
    ORIZ          0.06
    空间          0.06
     tarım        0.06
     ids          0.06
     balık        0.06
    agram         0.06
    Act Density 0.000%

    No Known Activations