    Explanations

    spies or traitors

    This neuron responds to words indicating that someone has been replaced or is an imposter infiltrating a group.

    Negative Logits
    Token               Logit
     submar             -0.07
    igration            -0.07
     müdür              -0.06
     storms             -0.06
    [token not shown]   -0.06
    esteem              -0.06
     درست               -0.06
    (*)(                -0.06
     ricerca            -0.06
    nict                -0.06
    Positive Logits
    Token               Logit
    _quad               0.06
    _PIX                0.06
     інш                0.06
    .addAll             0.06
     CGPoint            0.06
    xmlns               0.06
     sélection          0.06
     مبانی              0.06
    ━�                  0.06
     Yer                0.06
    Activation Density: 0.021%

    No Known Activations