INDEX
    Explanations

    Difficult situations

    This neuron activates on tokens expressing first-person perspective, particularly “I” and related self-referential words in personal statements.

    Negative Logits
    дут                     -0.06
     Boeing                 -0.06
    (token not rendered)    -0.06
     lowes                  -0.06
    (token not rendered)    -0.06
    cout                    -0.06
     Samantha               -0.06
    _LS                     -0.06
    丁目                    -0.06
     mesa                   -0.06

    Positive Logits
    fers                    0.07
    (""));↵                 0.07
    .compiler               0.06
     Handbook               0.06
     аллерг                 0.06
     musel                  0.06
     uomini                 0.06
    lovak                   0.06
     Fam                    0.06
    ileceği                 0.06
    Activation Density: 0.078%
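
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled tokens on which the neuron's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the neuron fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: one firing token out of 10,000.
sample = [0.0] * 9999 + [1.3]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.078% would correspond to roughly 78 firing tokens per 100,000 sampled.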

    No Known Activations