INDEX
Model
gemma-2-9b-it
Layer #
20
Steering Hook
blocks.20.hook_resid_pre
Steering Strength
74.5
Uploader
bot-neuronpedia
Created At
2/15/2025 1:06:43 AM
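As a rough illustration of what the hook and strength fields above mean, the sketch below adds a scaled vector to every position of a residual-stream activation. It assumes steering is purely additive and that the vector is used as stored. The function name `steer_residual` and the toy values are hypothetical, and the actual Neuronpedia steering code may normalize or scale differently.

```python
from typing import List

# Values copied from the page metadata.
STEERING_HOOK = "blocks.20.hook_resid_pre"
STEERING_STRENGTH = 74.5


def steer_residual(
    resid: List[List[float]],
    vector: List[float],
    strength: float = STEERING_STRENGTH,
) -> List[List[float]]:
    """Return a copy of resid with strength * vector added at every position.

    resid is a [positions][d_model] nested list standing in for the tensor
    seen at the hook point. Additive steering is an assumption here.
    """
    for row in resid:
        if len(row) != len(vector):
            raise ValueError("vector width must match the residual width")
    return [[x + strength * v for x, v in zip(row, vector)] for row in resid]


# Toy 2-dimensional example (real Gemma 2 activations are far wider).
toy_resid = [[0.0, 1.0], [2.0, -1.0]]
toy_vector = [1.0, 0.0]
steered = steer_residual(toy_resid, toy_vector)
```

In a real model this addition would run inside a forward hook registered at `blocks.20.hook_resid_pre`, so that every later layer sees the shifted activations.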
Explanations
various types of punctuation and symbols used in written language
Negative Logits
Ανακτήθηκε
-0.47
Appellee
-0.42
IGENCE
-0.42
nahilalakip
-0.38
départ
-0.38
itinéraires
-0.38
terme
-0.38
estacks
-0.37
możli
-0.37
>')
-0.37
Positive Logits
0.52
purpoſe
0.49
Taktlose
0.47
+#+#
0.46
ſelf
0.45
#+#
0.45
punctuation
0.45
raiſ
0.44
ſta
0.44
Хьажоргаш
0.41
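Logit lists like the two above are commonly obtained by projecting a direction through the model's unembedding matrix and ranking tokens by the resulting score. The sketch below shows that ranking on toy data. It is an assumption about how such values are derived, not a description of this page's pipeline, which may also apply a final normalization step. The helper names and the toy tokens are hypothetical.

```python
from typing import Dict, List, Tuple


def logit_effects(
    vector: List[float], unembed: Dict[str, List[float]]
) -> Dict[str, float]:
    """Dot the direction with each token's unembedding row."""
    return {
        token: sum(v * w for v, w in zip(vector, row))
        for token, row in unembed.items()
    }


def top_and_bottom(
    effects: Dict[str, float], k: int
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return (k most negative, k most positive) token/score pairs."""
    ranked = sorted(effects.items(), key=lambda pair: pair[1])
    return ranked[:k], ranked[::-1][:k]


# Toy vocabulary with 2-dimensional rows, purely illustrative.
toy_unembed = {"a": [0.5, 9.0], "b": [-0.25, 1.0], "c": [0.0, 3.0]}
effects = logit_effects([1.0, 0.0], toy_unembed)
negative, positive = top_and_bottom(effects, k=1)
```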
Activations Density 3.808%
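An activation density figure is usually read as the share of sampled tokens on which the feature fires at all. The sketch below computes that share as a percentage, assuming "fires" means an activation strictly greater than zero. The function name and the sample values are hypothetical, and the page's own definition may use a different threshold.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Percentage of tokens whose activation exceeds threshold."""
    if not activations:
        raise ValueError("need at least one activation")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy sample: one active token out of four.
toy_density = activation_density([0.0, 0.0, 1.2, 0.0])
```

Under this reading, a density of 3.808% would mean the feature is active on roughly 38 of every 1,000 sampled tokens.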