INDEX
Explanations
suppressing information
The neuron fires on words denoting acts of suppression or silencing (e.g., “crush,” “suppress,” “conceal,” “silencing”).
Negative Logits
_bit          -0.07
效            -0.06
','',         -0.06
Expired       -0.06
以            -0.06
serde         -0.06
toItem        -0.06
.branch       -0.06
ValuePair     -0.06
дис           -0.06
Positive Logits
fx            0.07
labels        0.06
лению         0.06
Veterans      0.06
commerce      0.06
izen          0.06
Prim          0.06
dovol         0.06
assertion     0.06
refrigerator  0.06
Activations Density 0.020%
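
As a rough illustration of what a density figure like the one above could mean, the sketch below assumes that activation density is the fraction of token positions on which the neuron's activation exceeds a threshold. That definition, the function name `activation_density`, the threshold, and the sample data are all assumptions made for illustration; the page itself does not say how the value is computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition for illustration only; the dashboard does not
    state how its density figure is derived.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 2 of them active.
acts = [0.0] * 9998 + [1.3, 0.7]
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this assumed definition, a density of 0.020% would correspond to the neuron being active on roughly 2 of every 10,000 tokens.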