INDEX
    Explanations

    This neuron detects moderate performance qualifiers, especially the word “reasonable” (and similar hedging adjectives), when they describe acceptable specs.

    Negative Logits
    Token            Logit
    " μο"            -0.07
    "58"             -0.06
    " Frid"          -0.06
    " Radius"        -0.06
    " utterly"       -0.06
    " matched"       -0.06
    " glyc"          -0.06
    " commerc"       -0.06
    " FormGroup"     -0.06
    " pointless"     -0.06
    Positive Logits
    Token            Logit
    ".rev"           0.07
    "|$"             0.06
    " ja"            0.06
    "'])){↵"         0.06
    " iVar"          0.06
    " الوطني"        0.06
    "stoup"          0.06
    ".'↵↵"           0.06
    " provocative"   0.06
    "untu"           0.06
    Act Density 0.046%

    No Known Activations