Explanations

<THINKING>
Method 1: MAX_ACTIVATING_TOKENS show many ATLAS/CMS patterns, but no single preceding-token pattern emerges.
Method 2: TOKENS_AFTER_MAX_ACTIVATING_TOKEN are varied (announced, data, experiments, etc.), with no clear commonality.
Method 3: TOP_POSITIVE_LOGITS are all detector- or experiment-related terms (detector, detectors, detection, experiments, and the Chinese words for experiment and detection). This clear common theme indicates that the neuron predicts detector/experiment terminology.
Method 4: TOP_ACTIVATING_TEXTS discuss particle physics experiments, which further supports the detector/experiment focus but adds nothing beyond Method 3.
Conclusion: The neuron is tuned to detector/experiment words.
Explanation: say detectors
</THINKING>
say detectors

Negative Logits

Token            Logit
"eval"           -17.00
"王国"           -16.75   (Chinese: kingdom)
" loadImage"     -16.25
"errmsg"         -16.25
"autoload"       -16.25
"_thumbnail"     -16.25
" vitae"         -16.00
" Renault"       -16.00
"WRAPPER"        -15.88
"道路交通"       -15.88   (Chinese: road traffic)

Positive Logits

Token            Logit
" detectors"     19.75
" detector"      19.38
"实验"           19.25   (Chinese: experiment)
"探测"           18.75   (Chinese: detection)
" experiments"   18.50
"detector"       18.25
" Detector"      18.13
"Detector"       17.75
" detection"     17.75
" detections"    17.25
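
The "common theme" check that Method 3 applies to the positive logits can be sketched in a few lines of Python. This is only an illustration, not the dashboard's actual procedure: the `GLOSSES` table, the `normalize` helper, and the crude plural stripping are assumptions introduced here, and the token/logit pairs are copied from the list above.

```python
from collections import defaultdict

# Top positive logits copied from the list above as (token, logit) pairs.
# Leading spaces are part of the token and are kept on purpose.
TOP_POSITIVE_LOGITS = [
    (" detectors", 19.75),
    (" detector", 19.38),
    ("实验", 19.25),
    ("探测", 18.75),
    (" experiments", 18.50),
    ("detector", 18.25),
    (" Detector", 18.13),
    ("Detector", 17.75),
    (" detection", 17.75),
    (" detections", 17.25),
]

# Hypothetical hand-written glosses for the non-English tokens.
GLOSSES = {"实验": "experiment", "探测": "detection"}


def normalize(token: str) -> str:
    """Strip whitespace, apply a gloss if any, lowercase, drop a trailing 's'."""
    word = token.strip()
    word = GLOSSES.get(word, word).lower()
    # Crude plural handling, good enough for this small list only.
    return word[:-1] if word.endswith("s") else word


def group_by_theme(pairs):
    """Group (token, logit) pairs by their normalized form."""
    groups = defaultdict(list)
    for token, logit in pairs:
        groups[normalize(token)].append((token, logit))
    return dict(groups)


groups = group_by_theme(TOP_POSITIVE_LOGITS)
for theme, members in groups.items():
    top = max(logit for _, logit in members)
    print(f"{theme}: {len(members)} tokens, top logit {top:.2f}")
```

Under these assumptions all ten tokens collapse into three closely related themes (detector, detection, experiment), which matches the reading that the neuron promotes detector/experiment terminology.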

Activations Density 0.072%
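
The page does not define "Activations Density". A common reading, assumed here, is the fraction of sampled token positions on which the neuron's activation is above zero. The sketch below shows that calculation on synthetic data chosen to reproduce the 0.072% figure; the function name `activation_density` and the toy activations are hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of token positions where the
    neuron fires at all; the source page does not spell this out.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 100,000 token positions, 72 of them active.
toy_activations = [0.0] * 99_928 + [1.5] * 72
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

A density this low means the neuron is silent on the vast majority of tokens, which fits a neuron specialized for a narrow topic.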