INDEX
    Explanations

    The neuron activates on occurrences of the brand name “Google.”

    Negative Logits
    Token            Logit
    "recipe"         -0.07
    "ourses"         -0.07
    " Boards"        -0.07
    "іт"             -0.07
    " bake"          -0.06
    " deserve"       -0.06
    "avings"         -0.06
    "macro"          -0.06
    " bathrooms"     -0.06
    "ident"          -0.06

    Positive Logits
    Token            Logit
    " Tesla"         0.07
    "ений"           0.06
    " Lemma"         0.06
    ".Fatal"         0.06
    "스를"           0.06
    " slamming"      0.06
    "?v"             0.05
    ",x"             0.05
    " Toyota"        0.05
    " خدمات"         0.05
    Activation Density: 0.007%

    No Known Activations