INDEX
    Explanations

    mentions of the name "Simon" at a high level of activation

    mentions of the name "Simon."

    Negative Logits
    Token       Logit
    "olulu"     -0.73
    "00200000"  -0.72
    "reek"      -0.72
    "late"      -0.71
    "eals"      -0.70
    "merce"     -0.69
    "doors"     -0.67
    "ktop"      -0.67
    "rings"     -0.66
    "laws"      -0.66
    Positive Logits
    Token       Logit
    " Simon"    1.16
    "Simon"     1.08
    " Says"     0.80
    " Richie"   0.79
    " Gerr"     0.79
    " Fraser"   0.73
    "irtual"    0.73
    " Baron"    0.72
    "zman"      0.72
    " Baz"      0.72
    Activation Density: 0.007%

    No Known Activations