    Explanations

    This neuron specifically detects mentions of personal connections, most notably the phrase “friend of.”

    Negative Logits
    Token             Logit
    " Mild"           -0.07
    "Buscar"          -0.06
    " Filip"          -0.06
    " cmp"            -0.06
    " představ"       -0.06
    " як"             -0.06
    "acas"            -0.06
    "bsite"           -0.06
    " fasting"        -0.06
    " windshield"     -0.06
    Positive Logits
    Token             Logit
    "oring"           0.07
    "IONS"            0.07
    " Undo"           0.06
    "ANTED"           0.06
    "$status"         0.06
    "edir"            0.06
    "\tgame"          0.06
    ".Qt"             0.06
    " tink"           0.06
    " issue"          0.06
    Activation Density: 0.016%

    No Known Activations