INDEX
Model: gemma-2-9b-it
Layer #: 20
Steering Hook: blocks.20.hook_resid_pre
Steering Strength: 68
Uploader: bot-neuronpedia
Created At: 2/15/2025 1:06:43 AM
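The hook name and strength listed above suggest how a vector like this is typically applied: a scaled copy of the vector is added to the residual stream entering block 20. The following is a minimal sketch under that assumption only. The function `steer_residual`, the toy activations, and the 3-dimensional vector are hypothetical; the real vector has the model's residual width, and the actual steering code may scale or normalize differently.

```python
from typing import List


def steer_residual(
    resid: List[List[float]], vector: List[float], strength: float
) -> List[List[float]]:
    """Return a copy of `resid` with strength * vector added at every position.

    `resid` is a toy stand-in for the activations at a hook such as
    blocks.20.hook_resid_pre, shaped [positions][d_model].
    """
    if any(len(row) != len(vector) for row in resid):
        raise ValueError("vector must match the residual width")
    return [[x + strength * v for x, v in zip(row, vector)] for row in resid]


# Hypothetical toy values, not the real vector from this page.
resid = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
vector = [0.5, 0.0, -0.5]

# Assumption: plain additive steering with the listed strength of 68.
steered = steer_residual(resid, vector, strength=68.0)
print(steered)
```

In a real setup the same addition would run inside a forward hook registered at the listed hook point, so that every later layer sees the shifted activations.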
Explanations
terms related to cybersecurity and security risks
Negative Logits
HttpFoundation   -0.48
himo             -0.46
Smarty           -0.42
httphttps        -0.39
ctrons           -0.38
Hauptartikel     -0.38
ounder           -0.38
celotti          -0.38
SYLLABLE         -0.38
oznam            -0.37
Positive Logits
security         0.66
Security         0.65
Security         0.65
security         0.62
SECURITY         0.57
fromnode         0.57
SECURITY         0.52
ValueStyle       0.50
Chham            0.47
Seguridad        0.46
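Token lists like the ones above are commonly produced by projecting the vector onto each token's unembedding direction and sorting the resulting scores. The sketch below illustrates that idea on toy data. It is an assumption about how such lists are built, not a description of how this page computed them; `rank_tokens`, the token names, and all numbers are hypothetical, and any normalization applied before the projection is left out.

```python
from typing import Dict, List, Tuple

Ranked = List[Tuple[str, float]]


def rank_tokens(
    vector: List[float], unembed: Dict[str, List[float]], k: int
) -> Tuple[Ranked, Ranked]:
    """Score each token by its dot product with `vector`.

    Returns (k most negative, k most positive) as (token, score) pairs.
    """
    scores = {
        tok: sum(v * w for v, w in zip(vector, direction))
        for tok, direction in unembed.items()
    }
    ascending = sorted(scores.items(), key=lambda item: item[1])
    return ascending[:k], ascending[::-1][:k]


# Hypothetical 2-dimensional toy vocabulary, not real model weights.
vector = [1.0, 0.0]
unembed = {
    "tok_a": [0.5, 0.25],
    "tok_b": [-0.5, 1.0],
    "tok_c": [0.0, 0.0],
}

negative, positive = rank_tokens(vector, unembed, k=1)
print("negative:", negative)
print("positive:", positive)
```

Near-identical entries in such a list (for example several casings of the same word) usually correspond to distinct vocabulary tokens that differ in leading whitespace or capitalization.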
Activations Density: 0.035%