Explanations and contextual notes
The neuron activates on uppercase abbreviations or acronyms (e.g., EHR, GIL) in the text.
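As a rough illustration of the kind of text this explanation describes, the sketch below flags candidate acronyms with a regular expression. The pattern (a standalone run of 2 to 6 ASCII capital letters) and the names `ACRONYM_RE` and `find_acronyms` are assumptions made for this example; they approximate the written explanation and are not derived from the neuron's actual activations.

```python
import re

# Assumption: an "uppercase abbreviation or acronym" is approximated as a
# standalone run of 2 to 6 ASCII capital letters. This is a hypothetical
# proxy for the explanation text, not the neuron's real behavior.
ACRONYM_RE = re.compile(r"\b[A-Z]{2,6}\b")


def find_acronyms(text: str) -> list[str]:
    """Return candidate acronyms in order of appearance."""
    return ACRONYM_RE.findall(text)


sample = "The EHR export holds the GIL while it runs."
print(find_acronyms(sample))  # ['EHR', 'GIL']
```

A heuristic like this can help pick candidate test sentences, but only the recorded activations can confirm where the neuron actually fires.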
Negative Logits
Logit   Token
-0.06   Capability
-0.06   uminum
-0.06   -inflammatory
-0.06   eerie
-0.06   ----------------------------------------------------------------------↵
-0.06   辅
-0.06   irgend
-0.06   (tex
-0.06   Optionally
-0.06   comeback
Positive Logits
Logit   Token
0.08    charms
0.07    erving
0.07    ,char
0.07    recognizing
0.07    atti
0.06    heid
0.06    PosX
0.06    (define
0.06    crate
0.06    처럼
Activations Density: 0.055%