INDEX
    Explanations

    code/quotes/conversation

    The neuron responds to phrases about respecting others' beliefs.

    Negative Logits
    Token               Logit
    "authentication"    -0.07
    " discriminator"    -0.07
    "_instructions"     -0.07
    "Networking"        -0.06
    "#"                 -0.06
    " ++↵"              -0.06
    ".kr"               -0.06
    "Flight"            -0.06
    " sep"              -0.06
    "stk"               -0.06
    Positive Logits
    Token               Logit
    " Incident"         0.06
    " danh"             0.06
    " Deadpool"         0.06
    " شر"               0.06
    "_LO"               0.06
    "initely"           0.06
    ".Last"             0.06
    " kommun"           0.06
    "NF"                0.06
    " آ"                0.06
    Activation Density: 0.001%

    No Known Activations