INDEX
Model: gemma-2-9b-it
Layer #: 20
Steering Hook: blocks.20.hook_resid_pre
Steering Strength: 64
Uploader: bot-neuronpedia
Created At: 2/15/2025 1:06:43 AM
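The hook name and strength listed here describe adding a scaled vector to the residual stream entering layer 20. A minimal sketch of that idea in plain Python follows. It assumes the steering vector is multiplied by the strength and added at every token position. The function name `steer_residual` and the toy values are hypothetical, and the actual Neuronpedia implementation (for example, any normalization of the vector) may differ.

```python
from typing import List

Vector = List[float]


def steer_residual(
    resid: List[Vector], steering_vector: Vector, strength: float
) -> List[Vector]:
    """Return a copy of `resid` with strength * steering_vector added at each position.

    `resid` stands in for the residual-stream activations at a hook such as
    blocks.20.hook_resid_pre: one hidden vector per token position.
    """
    steered = []
    for position in resid:
        if len(position) != len(steering_vector):
            raise ValueError("hidden size of resid and steering_vector must match")
        steered.append([x + strength * v for x, v in zip(position, steering_vector)])
    return steered


# Toy example: 2 token positions, hidden size 3 (a real model's hidden size is far larger).
resid = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
vector = [0.5, 0.0, -0.25]  # hypothetical steering direction
steered = steer_residual(resid, vector, strength=64)
```

In a hooked-model library the same addition would typically run inside a forward hook registered at the named hook point. The list-based version above only illustrates the arithmetic.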
Explanations
words that indicate speculation or uncertainty
Negative Logits
styleType       -0.70
celotti         -0.63
нгред           -0.63
esternos        -0.54
Kariera         -0.54
Houſe           -0.54
tyimages        -0.53
constitutive    -0.52
ロウィン        -0.52
redient         -0.52
Positive Logits
might           0.65
if              0.51
would           0.50
could           0.50
might           0.48
Might           0.47
may             0.46
Might           0.46
可能會          0.44
pourrait        0.43
Activations Density: 0.004%
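To put the density figure in concrete terms, the sketch below computes an activation density for a list of per-token activations. It assumes that density means the fraction of sampled tokens on which the activation is above zero. That definition, the function name `activation_density`, and the toy data are assumptions for illustration, not something this page states.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data: 1 active token out of 25,000 sampled tokens is a density of 0.004%.
acts = [0.0] * 24_999 + [3.2]
density = activation_density(acts)
density_label = f"{density:.3%}"
```

Under this reading, a density of 0.004% corresponds to roughly one active token per 25,000 sampled, so the direction fires very sparsely.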