INDEX
    Model: gemma-2-9b-it
    Layer #: 20
    Steering Hook: blocks.20.hook_resid_pre
    Steering Strength: 61
    Uploader: bot-neuronpedia
    Created At: 2/15/2025 1:06:43 AM
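
The Steering Hook and Steering Strength fields above suggest the usual activation-steering recipe, in which a scaled vector is added to the residual stream entering block 20. The following is a minimal sketch in plain Python, assuming simple additive steering at every token position. The function name `steer_resid_pre`, the toy dimensions, and the toy vector are hypothetical; the page does not say how Neuronpedia normalises or scales the vector, so its actual implementation may differ.

```python
from typing import List

STEERING_STRENGTH = 61.0  # "Steering Strength" from the listing above
HOOK_NAME = "blocks.20.hook_resid_pre"  # "Steering Hook" from the listing above


def steer_resid_pre(
    resid: List[List[float]],
    vector: List[float],
    strength: float = STEERING_STRENGTH,
) -> List[List[float]]:
    """Return a copy of `resid` (positions x width) with strength * vector
    added at every position. Assumption: plain additive steering."""
    if any(len(row) != len(vector) for row in resid):
        raise ValueError("vector length must match the residual width")
    return [[x + strength * v for x, v in zip(row, vector)] for row in resid]


# Toy data: 2 token positions, width 3 (the real model is far wider).
toy_resid = [[0.0, 1.0, -1.0], [2.0, 0.0, 0.5]]
toy_vector = [1.0, 0.0, -0.5]  # hypothetical stand-in for the raw vector
steered = steer_resid_pre(toy_resid, toy_vector)
```

In a real setup the same addition would run inside a forward hook registered at `HOOK_NAME`, with the uploaded raw vector in place of `toy_vector`.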
    Explanations

    instances of the word "evaluations."

    Negative Logits
    Token                 Logit
    "Personendaten"       -0.74
    " AssemblyCulture"    -0.57
    "Smarty"              -0.56
    "redient"             -0.51
    "WriteTagHelper"      -0.48
    " SYLLABLE"           -0.48
    " barata"             -0.46
    "ècie"                -0.45
    " cheaper"            -0.45
    " ویکی‌پدیای"          -0.44
    Positive Logits
    Token                 Logit
    " evaluation"         0.68
    " evalu"              0.56
    " Evaluation"         0.56
    " evaluate"           0.55
    "Evaluation"          0.54
    " evaluations"        0.54
    " evaluating"         0.51
    " evaluated"          0.50
    " avaliação"          0.50
    "評価"                0.49
    Act Density: 0.000%

    No Known Activations