    Explanations

    This neuron detects mentions of “Steve Jobs.”

    Negative Logits
    ена
    -0.07
    ает
    -0.07
    .description
    -0.07
     yellow
    -0.06
    .Up
    -0.06
    .Field
    -0.06
    ि�
    -0.06
    ’un
    -0.06
    -0.06
    edores
    -0.06
    Positive Logits
    agnost
    0.06
    teness
    0.06
     hade
    0.06
     ';↵↵
    0.06
     всей
    0.06
    包含
    0.06
     Messaging
    0.06
     preach
    0.06
     StyleSheet
    0.05
     Doom
    0.05
    Activation Density: 0.005%

    No Known Activations