    Model: gemma-2-9b-it
    Layer: 20
    Steering Hook: blocks.20.hook_resid_pre
    Steering Strength: 68.5
    Uploader: bot-neuronpedia
    Created At: 2/15/2025 1:06:43 AM
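
For orientation, here is a minimal sketch of what applying a steering vector at a residual-stream hook such as blocks.20.hook_resid_pre could look like. It uses toy NumPy arrays instead of the real model. The function name `steer_residual`, the toy width of 8, and the choice to normalise the vector before scaling are assumptions made for illustration; the page does not say how the strength of 68.5 is applied to the raw vector.

```python
import numpy as np

def steer_residual(resid, vector, strength):
    """Add a scaled steering direction to every position of a residual tensor.

    resid:  array of shape (seq_len, d_model), activations at the hook point
    vector: array of shape (d_model,), the steering direction
    """
    # Assumption: the vector is normalised to unit length before scaling.
    unit = vector / np.linalg.norm(vector)
    return resid + strength * unit

# Toy stand-ins; d_model = 8 is illustrative, not the model's real width.
rng = np.random.default_rng(0)
resid = rng.normal(size=(4, 8))
vector = rng.normal(size=8)

steered = steer_residual(resid, vector, strength=68.5)
```

Under this assumption every position is shifted by the same offset, whose length equals the steering strength.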
    Explanations

    mentions of the World Health Organization (WHO)

    Negative Logits
    "Personendaten"     -0.36
    " itinéraires"      -0.36
    " tank"             -0.35
    " bounced"          -0.35
    " replacement"      -0.35
    "replacement"       -0.34
    " kick"             -0.34
    " nutshell"         -0.34
    " finition"         -0.33
    " hates"            -0.33
    Positive Logits
    " himſelf"          0.60
    " itſelf"           0.60
    " purpoſe"          0.59
    "ſelf"              0.57
    " myſelf"           0.56
    "ſelves"            0.55
    "ValueStyle"        0.54
    "wiſe"              0.54
    "leſs"              0.53
    " ſta"              0.52
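
Token lists like the ones above are commonly produced by projecting a direction through the model's unembedding matrix and sorting the resulting scores. The sketch below shows that idea on a toy vocabulary. The function name `top_and_bottom_tokens`, the toy matrix, and the absence of any normalisation are assumptions; the page does not document how its logit values were computed.

```python
import numpy as np

def top_and_bottom_tokens(vector, unembed, vocab, k=1):
    """Rank vocabulary tokens by their score under a direction.

    vector:  array of shape (d_model,)
    unembed: array of shape (d_model, vocab_size)
    vocab:   list of vocab_size token strings
    """
    scores = vector @ unembed      # one score per vocabulary token
    order = np.argsort(scores)     # ascending: most negative first
    bottom = [(vocab[i], float(scores[i])) for i in order[:k]]
    top = [(vocab[i], float(scores[i])) for i in order[::-1][:k]]
    return top, bottom

# Toy data only; the real vocabulary and unembedding are far larger.
vocab = [" himſelf", "Personendaten", " tank", "wiſe"]
unembed = np.array([[0.60, -0.36, -0.35, 0.54],
                    [0.00,  0.00,  0.00, 0.00]])
vector = np.array([1.0, 0.0])

top, bottom = top_and_bottom_tokens(vector, unembed, vocab, k=1)
```

The quotes around tokens matter here: a leading space is part of the token, so " replacement" and "replacement" are separate vocabulary entries.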
    Act Density 0.000%

    No Known Activations