INDEX
    Explanations

    news articles

    The neuron activates on discourse-level connectives, such as contrastive or causal transition markers (“despite,” “results,” “therefore”).

    Negative Logits
    ekt                 -0.07
     kinds              -0.07
     wann               -0.06
     WCS                -0.06
     nights             -0.06
     اغ                 -0.06
    -income             -0.06
     systems            -0.06
    ibration            -0.06
     kind               -0.06

    Positive Logits
     knih                0.06
     reinforcements      0.06
     >",                 0.06
     ):                  0.06
    ـ                    0.06
                         0.06
     BAL                 0.06
     //_                 0.06
    ruž                  0.06
    يكا                  0.06
    Activation Density: 0.090%

    No Known Activations