Explanations
definitions and clarifications
This neuron detects meta-linguistic qualifiers and definitional phrasing: words such as “sometimes,” “often,” “commonly,” “referred,” and “term” that appear in parenthetical or appositive explanations introducing alternate names or definitions.
Negative Logits
Token       Logit
incurred    -0.07
تقو         -0.07
ests        -0.06
_INV        -0.06
鬼          -0.06
LTD         -0.06
Appe        -0.06
playable    -0.06
BTS         -0.06
وات         -0.06
Positive Logits
Token          Logit
惊             0.08
riches         0.07
Professional   0.06
přem           0.06
popover        0.06
defining       0.06
markup         0.06
.transforms    0.06
↵↵↵↵↵↵↵↵↵↵↵    0.06
promin         0.06
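
The page does not say how the two logit lists are produced. A common approach in dashboards of this kind is to project the neuron's output direction through the model's unembedding matrix and report the tokens with the lowest and highest resulting scores. The sketch below illustrates that approach under that assumption; the function names, the toy vectors, and the placeholder tokens are all hypothetical and are not taken from this page.

```python
def logit_effects(direction, unembed):
    """Dot the neuron's output direction with each token's unembedding vector.

    direction: list of floats (hypothetical output direction of the neuron)
    unembed:   dict mapping token -> list of floats (hypothetical unembedding rows)
    """
    return {
        token: sum(d * w for d, w in zip(direction, vector))
        for token, vector in unembed.items()
    }


def bottom_and_top(effects, k):
    """Return the k most negative and the k most positive (token, score) pairs."""
    ranked = sorted(effects.items(), key=lambda pair: pair[1])
    negative = ranked[:k]          # most suppressed tokens, lowest score first
    positive = ranked[::-1][:k]    # most promoted tokens, highest score first
    return negative, positive


# Made-up two-dimensional example with placeholder tokens.
direction = [1.0, -0.5]
unembed = {
    "tok_a": [0.06, 0.0],
    "tok_b": [0.04, -0.02],
    "tok_c": [-0.07, 0.0],
    "tok_d": [-0.02, 0.06],
}

effects = logit_effects(direction, unembed)
negative, positive = bottom_and_top(effects, k=2)
```

With scores as small as the ones listed here (all within ±0.08), the ranking separates tokens only weakly, which may be why the lists look unrelated to the explanation text.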
Activations Density: 0.041%
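
Activation density is usually understood as the fraction of sampled tokens on which the neuron fires. The sketch below computes it that way, assuming "fires" means an activation strictly above a threshold of zero; the function name, the threshold, and the toy data are assumptions made for illustration, not details from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above `threshold`."""
    if not activations:
        return 0.0
    active = sum(1 for value in activations if value > threshold)
    return active / len(activations)


# Toy data (hypothetical): 100,000 tokens, 41 of which activate.
toy_activations = [0.0] * 99_959 + [1.0] * 41
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

At a density of this order the neuron is silent on nearly every token, so inspecting its top activating examples tends to be more informative than sampling text at random.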