INDEX
    Explanations

    This neuron fires on tokens that attribute information to a source: words such as “input,” “from,” “made,” or “feedback” that signal the origin of reported data or contributions.

    Negative Logits
    Token         Logit
    " reduces"    -0.06
    " serpent"    -0.06
    "clf"         -0.06
    " suffer"     -0.06
    ".diff"       -0.06
    " devil"      -0.06
    " duck"       -0.06
    "ssize"       -0.06
    "luluk"       -0.06
    "очь"         -0.06
    Positive Logits
    Token                 Logit
    " inputs"             0.08
    " input"              0.08
    " Input"              0.07
    (no visible token)    0.07
    " выход"              0.07
    " Solar"              0.07
    ">\<"                 0.07
    "Fonts"               0.06
    "FL"                  0.06
    "inputs"              0.06
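
The page does not say how these per-token logit values were produced. One common way to obtain lists like the two above is to project a neuron's output direction through the model's unembedding matrix and sort the result. The sketch below assumes that approach. The names `w_out`, `W_U`, and `top_logit_tokens` are hypothetical, and the toy vocabulary and numbers are made up for illustration; they are not this neuron's weights.

```python
import numpy as np

def top_logit_tokens(w_out, W_U, vocab, k=3):
    """Return the k most negative and k most positive (token, effect) pairs.

    w_out : (d_model,) hypothetical output direction of one neuron
    W_U   : (d_model, vocab_size) hypothetical unembedding matrix
    vocab : list of token strings, one per column of W_U
    """
    effects = w_out @ W_U              # direct effect on each token's logit
    order = np.argsort(effects)        # ascending: most negative first
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy data only: four placeholder tokens and a 2-dimensional model.
vocab = ["tok_a", "tok_b", "tok_c", "tok_d"]
w_out = np.array([1.0, 0.0])
W_U = np.array([
    [0.08, 0.07, -0.06, -0.05],
    [0.00, 0.00,  0.00,  0.00],
])

negative, positive = top_logit_tokens(w_out, W_U, vocab, k=2)
print("negative:", negative)
print("positive:", positive)
```

Under this reading, a positive value means the neuron pushes that token's logit up when it fires, and a negative value means it pushes the logit down.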
    Act Density 0.010%

    No Known Activations
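
As a rough guide to reading the activation density figure: assuming it is the fraction of sampled tokens on which the neuron's activation is above zero (the page does not define it), 0.010% corresponds to about one token in 10,000. The helper `activation_density` and the sample data below are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: density counts strictly positive activations; the
    dashboard's exact definition is not given on the page.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)

# Made-up sample: the neuron fires on 1 of 10,000 tokens.
sample = [0.0] * 9999 + [1.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low is consistent with a neuron that responds to a narrow set of contexts, which may also be why no example activations are listed.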