INDEX
    Explanations

    reviews and articles

    This neuron detects mentions of the language model’s developing organization or system identifier (e.g., “Large Model Systems Organization (LMSYS)”).

    Negative Logits
    -simple
    -0.06
     cellar
    -0.06
    _ob
    -0.06
    -python
    -0.06
    Suc
    -0.06
    -0.06
    -0.06
     ""),
    -0.06
    ーブル
    -0.06
     elő
    -0.06
    Positive Logits
     interface
    0.07
     –↵↵
    0.06
    (logging
    0.06
     Tele
    0.06
    (guild
    0.06
    '↵
    0.06
    ↵        ↵
    0.06
     součas
    0.06
    ROME
    0.06
    dete
    0.06
    Act Density 0.004%

    No Known Activations