    Explanations

    text snippets

    The neuron detects a “news article” context: it responds to tokens such as “news” and “article” and to related summarization words.

    Negative Logits
    -Muslim         -0.07
    igor            -0.06
    .handleSubmit   -0.06
     Booker         -0.06
    spacer          -0.06
     isAdmin        -0.06
    departure       -0.06
    iclass          -0.06
    inventory       -0.06
    $wp             -0.06

    Positive Logits
     Doom           0.06
     slicing        0.06
     древ           0.06
     makes          0.06
    ้จ              0.06
     vào            0.06
    Uno             0.06
     muj            0.06
     demographic    0.06
    zyć             0.06

    Activation Density: 0.001%

    No Known Activations