    Explanations

    The neuron activates on occurrences of the name “Martin.”

    Negative Logits
    " Zoo"       -0.07
    " Yellow"    -0.07
    "819"        -0.07
    "879"        -0.07
    " Encore"    -0.07
    " Peach"     -0.07
    " Face"      -0.06
    " अक"        -0.06
    " Byrne"     -0.06
    "rible"      -0.06
    Positive Logits
    " Martin"    0.17
    "Martin"     0.13
    " martin"    0.11
    " Marty"     0.09
    " Craig"     0.08
    " Lewis"     0.08
    "TG"         0.08
    " Martins"   0.08
    "Ross"       0.08
    " Gordon"    0.08
    Activation Density: 0.017%

    No Known Activations