Explanations
Scientific Errors
The neuron activates on language typical of stating scientific results or predictions—words like “produce,” “correct,” “predict,” “spurious,” “transition,” and “forbidden” that describe theoretical findings or their validation.
Negative Logits
igated
-0.08
онт
-0.06
ु�
-0.06
activated
-0.06
循
-0.06
affine
-0.06
chip
-0.06
ishops
-0.06
-cycle
-0.06
ug
-0.06
Positive Logits
Output
0.07
Describe
0.07
pname
0.07
Pere
0.07
Beard
0.06
[,
0.06
ErrorException
0.06
/{{
0.06
бар
0.06
Statements
0.06
Activations Density 0.031%