INDEX
    Model: gemma-2-9b-it
    Layer #: 20
    Steering Hook: blocks.20.hook_resid_pre
    Steering Strength: 81
    Uploader: bot-neuronpedia
    Created At: 2/15/2025 1:06:43 AM
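
The fields above describe a steering setup: a vector added to the residual stream entering layer 20 (`blocks.20.hook_resid_pre`) at strength 81. A minimal sketch of that operation on toy data follows. The function name `steer_residual`, the normalization to unit length, and the choice to add the vector at every position are assumptions for illustration, not a confirmed description of how the site applies the vector.

```python
import math

def steer_residual(resid, vector, strength):
    """Add strength * unit(vector) to every position of a residual stream.

    resid: list of positions, each a list of floats (d_model wide).
    vector: steering direction (d_model wide), must be nonzero.
    Normalizing to unit length is an assumption of this sketch.
    """
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        raise ValueError("steering vector must be nonzero")
    unit = [v / norm for v in vector]
    return [[x + strength * u for x, u in zip(pos, unit)] for pos in resid]

# Toy example: 2 positions, d_model = 2 (the real model is far wider).
resid = [[1.0, 0.0], [0.0, 2.0]]
steered = steer_residual(resid, vector=[3.0, 4.0], strength=81)
```

In a real run the same addition would be registered as a hook on the named activation rather than applied to a Python list.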
    Explanations

    references to doors and their actions

    Negative Logits
    Voted             -0.38
     nasional         -0.37
     generali         -0.37
    Retain            -0.37
     Grunde           -0.36
     Corazón          -0.35
     mą               -0.35
     barata           -0.35
    스트              -0.35
     kaas             -0.35

    Positive Logits
    LookAnd           0.69
    CloseOperation    0.68
    🚪                0.60
     arrival          0.60
     ujednoznacz      0.60
     للاسماء          0.59
     الحره            0.58
    Хьажоргаш         0.58
    uxxxx             0.58
     doors            0.58
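
Token lists like the two above are commonly obtained by projecting a direction through the model's unembedding matrix and ranking tokens by the resulting score. The sketch below shows that ranking on a toy vocabulary. The function `top_logits`, the matrix `W_U`, and every number in it are made up for illustration; any normalization applied before the projection is not modelled here.

```python
def top_logits(vector, W_U, vocab, k=2):
    """Rank tokens by the dot product of `vector` with each unembedding column.

    W_U: d_model rows by vocab-size columns (list of lists).
    Returns (k most positive, k most negative) as (token, score) pairs.
    """
    scores = [
        (tok, sum(vector[i] * W_U[i][j] for i in range(len(vector))))
        for j, tok in enumerate(vocab)
    ]
    ranked = sorted(scores, key=lambda ts: ts[1], reverse=True)
    return ranked[:k], ranked[::-1][:k]

# Toy vocabulary reusing a few tokens from the page; scores are invented.
vocab = [" doors", " arrival", " kaas", "Voted"]
W_U = [
    [0.5, 0.25, -0.25, -0.5],
    [0.0, 0.5, 0.5, 0.0],
]
positive, negative = top_logits([1.0, 0.0], W_U, vocab)
```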
    Activation Density: 0.002%
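
Read as the share of token positions on which the direction fires, a density of 0.002% corresponds to about 2 positions in 100,000. A minimal sketch of that calculation follows; the function name `activation_density`, the zero threshold, and the sample data are assumptions for illustration, and the site's exact definition may differ.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("need at least one activation")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)

# Toy data: 2 active positions out of 100,000.
acts = [0.0] * 99_998 + [1.5, 0.7]
density = activation_density(acts)
```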

    No Known Activations