INDEX
    Explanations

    The neuron selectively fires on occurrences of the word “center” (especially in forms like “centered”).

    Negative Logits
     Ansi    -0.07
    Архів    -0.06
    acking    -0.06
     acı    -0.06
     discussed    -0.06
     INDEX    -0.06
    ax    -0.06
     '_    -0.06
     still    -0.06
    ुमत    -0.06
    Positive Logits
    #    0.06
    ¯¯    0.06
    '][$    0.06
     explodes    0.06
     trough    0.06
    OfDay    0.06
     dva    0.06
    ).(    0.06
     dilig    0.06
     Tv    0.06
    Activation Density: 0.192%

    No Known Activations