INDEX
    Model: gemma-2-9b-it
    Layer: 20
    Steering Hook: blocks.20.hook_resid_pre
    Steering Strength: 67
    Uploader: bot-neuronpedia
    Created At: 2/15/2025 1:06:43 AM
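
The hook and strength listed here describe an additive intervention on the residual stream. A minimal sketch of that idea follows, in pure Python on toy data. It assumes the vector is added unnormalised, scaled by the strength, at every token position; the page does not state the exact scaling convention, so treat that as an assumption. The function name `steer_residual` and the toy values are hypothetical.

```python
from typing import List

STEERING_STRENGTH = 67.0  # "Steering Strength" as listed on this page


def steer_residual(
    resid: List[List[float]],
    vector: List[float],
    strength: float = STEERING_STRENGTH,
) -> List[List[float]]:
    """Return a copy of `resid` with `strength * vector` added at every position.

    `resid` stands in for the activations seen at a hook such as
    blocks.20.hook_resid_pre, shaped [positions][d_model].
    Assumption: the vector is used as-is, without normalisation.
    """
    if any(len(row) != len(vector) for row in resid):
        raise ValueError("every residual row must match the vector length")
    return [[x + strength * v for x, v in zip(row, vector)] for row in resid]


# Toy example: 2 token positions, d_model = 4 (the real model is far wider).
toy_resid = [[0.0, 1.0, -1.0, 0.5], [2.0, 0.0, 0.0, -0.5]]
toy_vector = [0.1, 0.0, -0.1, 0.0]
steered = steer_residual(toy_resid, toy_vector)
```

In a real setup the same addition would run inside a forward hook registered at the listed hook point, so that every later layer sees the shifted activations.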
    Explanations

    expressions of gratitude and appreciation

    Negative Logits
     AssemblyCulture    -0.64
    Personendaten       -0.62
    лтемелер            -0.58
    MessageOf           -0.57
     ivelany            -0.56
     IFTT               -0.54
    tanleria            -0.54
    IRUS                -0.54
     diagnose           -0.53
    出版年                 -0.53
    Positive Logits
     appreciation       0.51
     thank              0.49
    Agrade              0.45
     agradecimiento     0.45
     Thank              0.45
     gratitude          0.45
     appreciated        0.44
     thanked            0.44
     grateful           0.43
     appreciate         0.42
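
Token lists like the negative and positive logits on this page are commonly produced by projecting a vector onto each token's unembedding row and sorting the scores. The sketch below shows that ranking step on toy data. It assumes a plain dot product with no normalisation, which this page does not confirm; `rank_token_logits`, the four-token vocabulary, and the 2-dimensional rows are all hypothetical.

```python
from typing import List, Tuple


def rank_token_logits(
    vector: List[float],
    unembed: List[List[float]],
    vocab: List[str],
    k: int = 2,
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return (most negative k, most positive k) tokens by dot-product score.

    Assumption: a token's score is the dot product of `vector` with that
    token's row of the unembedding matrix.
    """
    scores = [
        (sum(v * w for v, w in zip(vector, row)), tok)
        for tok, row in zip(vocab, unembed)
    ]
    ordered = sorted(scores, key=lambda pair: pair[0])  # ascending by score
    negative = [(tok, s) for s, tok in ordered[:k]]
    positive = [(tok, s) for s, tok in reversed(ordered[-k:])]
    return negative, positive


# Toy data: 4 tokens, d_model = 2. Real vocabularies and widths are far larger.
toy_vocab = [" thank", " gratitude", " diagnose", "IRUS"]
toy_unembed = [[1.0, 0.0], [0.8, 0.2], [-1.0, 0.0], [-0.5, -0.5]]
toy_vec = [0.5, 0.1]
negative, positive = rank_token_logits(toy_vec, toy_unembed, toy_vocab)
```

Each returned list is ordered from the most extreme score inward, matching the way the lists above run from the largest magnitude to the smallest.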
    Activation Density: 0.000%

    No Known Activations