INDEX
Model: gemma-2-9b-it
Layer #: 20
Steering Hook: blocks.20.hook_resid_pre
Steering Strength: 66.5
Uploader: bot-neuronpedia
Created At: 2/15/2025 1:06:43 AM
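The hook name and strength listed above describe where and how strongly the vector is applied. The following is a minimal sketch of one common form of activation steering, in which a scaled copy of the vector is added to the residual stream at every token position. This is an assumption: the listing does not say whether the vector is normalized first or how the strength is applied. The function name `steer_residual` and the toy values are hypothetical, and the toy width of 3 stands in for the real model width.

```python
from typing import List

HOOK_NAME = "blocks.20.hook_resid_pre"  # taken from the listing
STEERING_STRENGTH = 66.5                # taken from the listing


def steer_residual(resid: List[List[float]],
                   vector: List[float],
                   strength: float = STEERING_STRENGTH) -> List[List[float]]:
    """Return a steered copy of the activations at HOOK_NAME.

    resid  : [seq_len][d_model] residual-stream activations for one sequence
    vector : [d_model] steering direction (hypothetical values in this sketch)

    Assumption: steering is plain addition of strength * vector at every
    position; the listing itself does not confirm this.
    """
    if any(len(row) != len(vector) for row in resid):
        raise ValueError("vector length must match d_model")
    return [[x + strength * v for x, v in zip(row, vector)] for row in resid]


# Toy example: 2 token positions, d_model = 3.
toy_resid = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
toy_vector = [1.0, 0.0, -1.0]
steered = steer_residual(toy_resid, toy_vector)
```

In a real setup the same addition would run inside a forward hook registered at the named hook point, with the actual raw vector in place of `toy_vector`.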
Explanations
references to health or health-related topics
Negative Logits
bentar      -0.35
*)          -0.29
kandang     -0.29
)})         -0.27
pungkas     -0.27
↑           -0.26
)           -0.26
'           -0.25
probably    -0.25
hands       -0.25
Positive Logits
ValueStyle        0.80
rrggbb            0.75
myſelf            0.73
səhifə            0.72
RegressionTest    0.72
health            0.72
purpoſe           0.72
itſelf            0.71
ſelves            0.69
pleaſure          0.69
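Token lists like the ones above are commonly obtained by projecting a direction through the model's unembedding matrix and keeping the highest and lowest scoring tokens. The sketch below illustrates that idea under the assumption that this is how the scores were produced; the listing does not state the method. The helper names, token names, and numbers are all hypothetical toy values, not data from this vector.

```python
from typing import Dict, List, Tuple


def project_to_logits(vector: List[float],
                      unembed: Dict[str, List[float]]) -> Dict[str, float]:
    """Dot the direction with each token's unembedding column.

    unembed maps a token string to its [d_model] column (toy data here).
    """
    return {tok: sum(v * w for v, w in zip(vector, col))
            for tok, col in unembed.items()}


def top_and_bottom(scores: Dict[str, float],
                   k: int) -> Tuple[List[Tuple[str, float]],
                                    List[Tuple[str, float]]]:
    """Return the k highest and k lowest scoring tokens."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    positive = ranked[:k]        # most positive first
    negative = ranked[::-1][:k]  # most negative first
    return positive, negative


# Toy unembedding with d_model = 2 and three made-up tokens.
toy_unembed = {
    "tok_a": [0.5, 0.25],
    "tok_b": [-0.25, 1.0],
    "tok_c": [0.0, 0.5],
}
toy_direction = [1.0, 0.0]

scores = project_to_logits(toy_direction, toy_unembed)
positive, negative = top_and_bottom(scores, k=1)
```

With the real model, `toy_unembed` would be replaced by the full vocabulary-sized unembedding matrix and `k` by 10 to match the length of the lists shown.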
Activations Density: 0.000%