    Explanations

    This neuron selectively activates on machine-learning jargon, especially references to “model,” “pre-trained,” “fine-tuning,” and similar training-related terms.

    Negative Logits
    Token             Logit
    " أنا"            -0.08
    " SZ"             -0.07
    "_REPORT"         -0.07
    " similarities"   -0.06
    " mushrooms"      -0.06
    "їна"             -0.06
    "Rating"          -0.06
    "given"           -0.06
    "Amb"             -0.06
    "veget"           -0.06
    Positive Logits
    Token               Logit
    "ับร"               0.06
    " Nano"             0.06
    "herit"             0.06
    (token not shown)   0.06
    "828"               0.06
    " závě"             0.06
    " subscriptions"    0.06
    "full"              0.06
    ".Global"           0.06
    (token not shown)   0.06
    Act Density 0.016%
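
To put the density figure in perspective, here is a minimal sketch of how such a percentage could be computed. It assumes "Act Density" means the share of sampled tokens on which the neuron's activation is above zero. The page does not state its definition, so that reading, the function name `activation_density`, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: density = (active tokens / all sampled tokens) * 100.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Made-up sample: 100,000 tokens, of which 16 fire.
acts = [0.0] * 99_984 + [1.5] * 16
print(f"{activation_density(acts):.3f}%")  # prints 0.016%
```

Under this reading, a density of 0.016% corresponds to roughly 16 active tokens per 100,000 sampled.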

    No Known Activations