INDEX
    Model: gemma-2-9b-it
    Layer #: 20
    Steering Hook: blocks.20.hook_resid_pre
    Steering Strength: 69
    Uploader: bot-neuronpedia
    Created At: 2/15/2025 1:06:43 AM
    Explanations

    concepts related to carefulness and responsibility

    NEGATIVE LOGITS
    " Wikimedijinoj"      -0.64
    "MessageOf"           -0.60
    " IFTT"               -0.60
    " Paglinawan"         -0.58
    "intios"              -0.56
    "Personendaten"       -0.56
    " ویکی‌پدیای"          -0.53
    " AssemblyCulture"    -0.52
    "WriteTagHelper"      -0.50
    "tyimages"            -0.48
    POSITIVE LOGITS
    " caution"      0.41
    " deseo"        0.40
    "efois"         0.38
    " voul"         0.38
    " respect"      0.38
    " desire"       0.38
    " necesidad"    0.36
    " concerns"     0.36
    " Anliegen"     0.36
    " want"         0.36
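
The negative and positive logit lists above rank vocabulary tokens by how strongly the vector pushes their output logits down or up. One common way to produce such lists is to take the dot product of the vector with each token's unembedding row and sort the results. Whether this page was computed exactly that way is an assumption. The function name `top_logits` and the toy unembedding values below are hypothetical and only illustrate the ranking step.

```python
from typing import List, Tuple


def top_logits(
    vector: List[float],
    unembed_rows: List[List[float]],
    vocab: List[str],
    k: int = 10,
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Rank tokens by the dot product of `vector` with each token's unembedding row.

    Returns (most negative k, most positive k). This is an illustrative sketch,
    not the site's actual implementation.
    """
    if len(unembed_rows) != len(vocab):
        raise ValueError("need one unembedding row per vocabulary token")
    scores = [sum(v * w for v, w in zip(vector, row)) for row in unembed_rows]
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1])
    negative = ranked[:k]          # lowest scores first
    positive = ranked[::-1][:k]    # highest scores first
    return negative, positive


# Toy data (made up): 3 tokens, d_model = 2.
toy_vocab = [" caution", " want", "MessageOf"]
toy_unembed = [[1.0, 0.0], [0.5, 0.5], [-1.0, 0.0]]
toy_vector = [0.4, 0.0]

negative, positive = top_logits(toy_vector, toy_unembed, toy_vocab, k=1)
```

With real weights, `unembed_rows` would hold one row per vocabulary entry, and `k=10` would give lists of the same length as the ones shown on this page.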
    Activation Density: 0.001%

    No Known Activations
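
The steering settings listed on this page (hook `blocks.20.hook_resid_pre`, strength 69) describe where and how strongly the vector is applied. A minimal sketch of the usual additive form of steering follows, assuming the vector is scaled by the strength and added to the residual stream at every token position. The actual scaling or normalization used by the uploader is not stated here, so treat that as an assumption. The helper `steer_residual` and the toy numbers are hypothetical.

```python
from typing import List

HOOK_NAME = "blocks.20.hook_resid_pre"  # hook point named on this page
STEERING_STRENGTH = 69.0                 # strength named on this page


def steer_residual(
    resid: List[List[float]],
    vector: List[float],
    strength: float = STEERING_STRENGTH,
) -> List[List[float]]:
    """Return a copy of `resid` ([seq][d_model]) with strength * vector added per position.

    Assumes plain additive steering; any normalization of `vector` is left to the caller.
    """
    if any(len(row) != len(vector) for row in resid):
        raise ValueError("every residual row must match the vector's d_model")
    return [[x + strength * v for x, v in zip(row, vector)] for row in resid]


# Toy residual stream (made up): 2 positions, d_model = 3.
toy_resid = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
toy_vector = [0.01, 0.0, -0.01]

steered = steer_residual(toy_resid, toy_vector)
```

In a real run, a function like this would be registered as a forward hook at `HOOK_NAME`, so that the input to layer 20 is modified before the block computes its output.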