    Explanations

    The neuron activates on words that denote political sovereignty or status, e.g. “independent,” “separate,” “nation,” “country,” and “state.”

    Negative Logits
    Token           Logit
    " incidence"    -0.06
    " sunset"       -0.06
    "tank"          -0.06
    " weit"         -0.06
    "下的"          -0.06
    " počtu"        -0.06
    " Purs"         -0.06
    " Bring"        -0.06
    "angement"      -0.06
    " disse"        -0.06
    Positive Logits
    Token           Logit
    " stratej"      0.07
                    0.07
    " 대해서"       0.07
    " živ"          0.07
    " 그래서"       0.07
    "Installer"     0.06
    "pdev"          0.06
    " sidelined"    0.06
    "ném"           0.06
    " Lum"          0.06
    Activation Density: 0.017%

    No Known Activations