    Explanations

    Security, safety, and risk

    This neuron detects mentions of personal or sensitive user information being shared (e.g. locations, dates of birth, children’s names, travel plans, photos).

    Negative Logits
    Token             Logit
    "cre"             -0.07
    (blank in source) -0.07
    " Pull"           -0.06
    "ians"            -0.06
    " vibe"           -0.06
    " Office"         -0.06
    " caregiver"      -0.06
    " clr"            -0.06
    "Crime"           -0.06
    (blank in source) -0.06
    Positive Logits
    Token             Logit
    " newly"          0.07
    "ENDOR"           0.06
    "orda"            0.06
    "ология"          0.06
    " دف"             0.06
    " ontvangst"      0.06
    "(of"             0.06
    "instructions"    0.06
    "NI"              0.06
    " absolute"       0.06
    Activation Density: 0.178%

    No Known Activations