© Neuronpedia 2026

    Neuronpedia

    Qwen3-1.7B · 27-LLAMASCOPE-2-LORSA-16K-K64 · INDEX 16243
    Explanations

    say detectors

    Auto-interp reasoning:
    Method 1: The MAX_ACTIVATING_TOKENS show many ATLAS/CMS patterns, but no single preceding-token pattern.
    Method 2: The TOKENS_AFTER_MAX_ACTIVATING_TOKEN are varied (announced, data, experiments, etc.), with no clear commonality.
    Method 3: The TOP_POSITIVE_LOGITS are all detector-related terms (detector, detectors, detection, experiments, and the Chinese words for experiment and detection). This common theme indicates that the neuron predicts detector/experiment terminology.
    Method 4: The TOP_ACTIVATING_TEXTS discuss particle physics experiments. This supports the detector/experiment focus but adds nothing beyond Method 3.
    Conclusion: The neuron is tuned to detector/experiment words.


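    Negative Logits
    (tokens are quoted so that a leading space, where present, stays visible)
    Logit    Token
    -17.00   "eval"
    -16.75   "王国" (kingdom)
    -16.25   " loadImage"
    -16.25   "errmsg"
    -16.25   "autoload"
    -16.25   "_thumbnail"
    -16.00   " vitae"
    -16.00   " Renault"
    -15.88   "WRAPPER"
    -15.88   "道路交通" (road traffic)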
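    Positive Logits
    (tokens are quoted so that a leading space, where present, stays visible)
    Logit    Token
    19.75    " detectors"
    19.38    " detector"
    19.25    "实验" (experiment)
    18.75    "探测" (detection)
    18.50    " experiments"
    18.25    "detector"
    18.13    " Detector"
    17.75    "Detector"
    17.75    " detection"
    17.25    " detections"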
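
    The positive logits listed above contain several tokens that differ only in a leading space or in letter case. As a minimal sketch (not part of Neuronpedia itself), the following Python groups them by a normalized surface form to make the "detector" theme from Method 3 easier to see. The helper name `group_by_surface_form` and the normalization rule (strip whitespace, lowercase) are illustrative assumptions; the token/logit pairs are copied from this page.

```python
from collections import defaultdict

# (token, logit) pairs copied from the Positive Logits list on this page.
# A leading space is part of the token and is kept here on purpose.
POSITIVE_LOGITS = [
    (" detectors", 19.75),
    (" detector", 19.38),
    ("实验", 19.25),
    ("探测", 18.75),
    (" experiments", 18.50),
    ("detector", 18.25),
    (" Detector", 18.13),
    ("Detector", 17.75),
    (" detection", 17.75),
    (" detections", 17.25),
]


def group_by_surface_form(pairs):
    """Group tokens that differ only in surrounding whitespace or letter case.

    Hypothetical helper for illustration; the normalization is an assumption,
    not something the page itself defines.
    """
    groups = defaultdict(list)
    for token, logit in pairs:
        groups[token.strip().lower()].append((token, logit))
    return dict(groups)


groups = group_by_surface_form(POSITIVE_LOGITS)

# Print each normalized form with its variants, strongest logit first.
for form, members in sorted(
    groups.items(), key=lambda kv: max(l for _, l in kv[1]), reverse=True
):
    variants = ", ".join(f"{tok!r} ({logit:.2f})" for tok, logit in members)
    print(f"{form}: {variants}")
```

    Under this normalization, the four spellings of "detector" fall into a single group, which is one simple way to check that the top tokens share a theme rather than being ten unrelated predictions.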
    Activations Density 0.072%

    No Known Activations