    Explanations

    The neuron primarily detects occurrences of the word “discretion” (and its close variants in administrative/legal contexts).

    Negative Logits
    Token                  Logit
    (no visible token)     -0.07
    "_E"                   -0.06
    (no visible token)     -0.06
    " Harness"             -0.06
    "体系"                 -0.06
    " міс"                 -0.06
    " thirteen"            -0.06
    " є"                   -0.06
    " numero"              -0.06
    " Πα"                  -0.06
    Positive Logits
    Token                  Logit
    " discretion"          0.15
    " discretionary"       0.09
    (no visible token)     0.07
    "')")↵"                0.07
    "BagConstraints"       0.07
    "warn"                 0.07
    (no visible token)     0.07
    " fingert"             0.07
    " disappoint"          0.07
    "loon"                 0.07
    Act Density 0.002%
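
To put the density figure in concrete terms, here is a minimal sketch that assumes "Act Density" means the fraction of token positions on which the neuron's activation is above zero. That definition, the function name `activation_density`, and the sample data are illustrative assumptions, not taken from this page. Under that reading, 0.002% corresponds to roughly 2 active positions per 100,000 tokens.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition for illustration; the dashboard's exact
    computation is not stated in the source.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2 active positions out of 100,000 tokens.
sample = [0.0] * 99_998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low is consistent with a neuron that responds to a single rare word, though the page lists no recorded activations to confirm it.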

    No Known Activations