INDEX
    Explanations

    collaborations

    The neuron activates on tokens describing the model’s provenance, i.e. mentions of its development or joint training by institutions (such as “developed,” “trained,” institution names, and dates).

    Negative Logits
    Token          Logit
    " Haupt"       -0.07
    " هفت"         -0.07
    "arring"       -0.06
    "paring"       -0.06
    " Bearings"    -0.06
    " пен"         -0.06
    " HID"         -0.06
    "wash"         -0.06
    " kort"        -0.06
    "andest"       -0.06

    Positive Logits
    Token          Logit
    "(fs"           0.08
    "<li"           0.06
    "'])?"          0.06
    "acea"          0.06
    " nostalg"      0.06
    ".Protocol"     0.06
    ">>>("          0.06
    "341"           0.06
    " acompañ"      0.06
    " missed"       0.06
    Act Density 0.004%

    No Known Activations
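
The page does not define Act Density. A common reading is the fraction of sampled tokens on which the neuron's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means (number of active tokens) / (total tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 4 active tokens out of 100,000 sampled tokens.
acts = [0.0] * 100_000
for i in (10, 2_000, 50_000, 99_999):
    acts[i] = 1.5

density = activation_density(acts)
print(f"Act Density {density:.3%}")  # prints "Act Density 0.004%"
```

Under this reading, a density of 0.004% corresponds to roughly 4 active tokens per 100,000, which would be consistent with a neuron so sparse that no example activations are listed.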