INDEX
    Model: gemma-2-9b-it
    Layer: 20
    Steering Hook: blocks.20.hook_resid_pre
    Steering Strength: 74
    Uploader: bot-neuronpedia
    Created At: 2/15/2025 1:06:43 AM
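    The steering hook and strength listed above suggest that the vector is added to the residual stream entering layer 20. The hook name looks like the TransformerLens naming convention, but the sketch below does not use any library: it is a toy, pure-Python illustration. The function names, the toy shapes, and the choice to normalize the vector to unit length before scaling are all assumptions and are not taken from the page.

```python
# Hypothetical sketch of activation steering on a toy residual stream.
# steer_residual, normalize, and the toy data are illustrative assumptions.

STRENGTH = 74.0  # "Steering Strength" value shown on the page


def normalize(vec):
    """Scale vec to unit L2 norm (assumption: the vector acts as a direction)."""
    norm = sum(x * x for x in vec) ** 0.5
    if norm == 0:
        raise ValueError("steering vector must be non-zero")
    return [x / norm for x in vec]


def steer_residual(resid, vec, strength=STRENGTH):
    """Add strength * unit(vec) to every token position of resid.

    resid: list of token positions, each a list of d_model floats.
    vec:   list of d_model floats (the steering vector).
    """
    direction = normalize(vec)
    return [[h + strength * d for h, d in zip(pos, direction)] for pos in resid]


# Toy data: 2 token positions, d_model = 3 (the real model is far wider).
resid = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
vec = [3.0, 0.0, 4.0]  # L2 norm is 5, so the unit direction is [0.6, 0.0, 0.8]
steered = steer_residual(resid, vec)
```

    In a real run, the same addition would be applied inside a forward hook registered at `blocks.20.hook_resid_pre`, so that every later layer sees the shifted activations.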
    Explanations

    words related to positive emotions and enjoyable experiences

    Negative Logits
    Token               Logit
    "WriteTagHelper"    -0.51
    "bootstrapcdn"      -0.51
    "Personendaten"     -0.50
    " theoretically"    -0.50
    " Jazeera"          -0.49
    "sistors"           -0.49
    "tyimages"          -0.47
    "zeera"             -0.46
    " Sheet"            -0.46
    " Sheets"           -0.46
    Positive Logits
    Token               Logit
    " enjoyment"        0.63
    "enjoy"             0.58
    " joyful"           0.57
    " enjoy"            0.56
    "Enjoy"             0.56
    " Enjoy"            0.56
    " enjoyable"        0.56
    " joy"              0.52
    " gioia"            0.51
    " enjoyed"          0.50
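
    Logit lists like these are commonly produced by projecting a vector onto the model's unembedding matrix and ranking the vocabulary by the resulting scores. The sketch below shows that ranking step under that assumption. The tiny vocabulary, the matrix values, and the function names are made up for illustration, and the numbers it produces are not the values reported on this page.

```python
# Hypothetical sketch: rank vocabulary tokens by their dot product with a vector.
# The unembedding matrix and vocabulary are toy data, not gemma-2-9b-it weights.


def token_logits(vec, unembed, vocab):
    """Return {token: dot(vec, column of unembed for that token)}.

    unembed has shape d_model x vocab_size, stored as a list of rows.
    """
    scores = {}
    for j, tok in enumerate(vocab):
        scores[tok] = sum(vec[i] * unembed[i][j] for i in range(len(vec)))
    return scores


def top_and_bottom(scores, k):
    """Return (k highest-scoring, k lowest-scoring) token/score pairs."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k], ranked[-k:][::-1]


# Leading spaces are part of the token, so " enjoyment" and "enjoy" are distinct.
vocab = [" enjoyment", "enjoy", " Sheet", "bootstrapcdn"]
unembed = [
    [0.5, 0.25, -0.25, -0.5],
    [0.25, 0.5, 0.0, -0.25],
]
vec = [1.0, 0.5]

scores = token_logits(vec, unembed, vocab)
positive, negative = top_and_bottom(scores, k=2)
```

    Tokens with the largest scores are the ones the vector promotes most strongly, and the most negative scores mark the tokens it suppresses.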
    Activation Density: 0.001%

    No Known Activations