© Neuronpedia 2026

    Neuronpedia

    Qwen3-1.7B · 26-LLAMASCOPE-2-LORSA-16K-K64 · Index 496
    Explanations

    say "race"


    Negative Logits
    (tokens are quoted so that a leading space is visible)

    Token       Logit
    "HK"        -20.00
    "jsp"       -17.88
    "NV"        -17.25
    " HK"       -16.63
    "EPS"       -16.38
    "gz"        -16.25
    "jon"       -16.13
    "桂"        -16.13
    "mdl"       -16.00
    "MG"        -15.81

    Positive Logits
    (tokens are quoted so that a leading space is visible)

    Token         Logit
    "种族"        36.00
    " racial"     28.38
    " racially"   27.75
    " race"       26.63
    "rac"         24.63
    "racial"      24.50
    "Race"        24.38
    " Rac"        24.38
    " races"      23.63
    "race"        23.38

    Activations Density 0.245%

    No Known Activations