INDEX
    Explanations

    instructions and technical writing

    logical inconsistencies in claims about the presence of wildfire smoke or flames in images.

    This neuron activates on text where the model refers to its own reasoning—especially phrases like “your thought process.”

    Negative Logits
    Token            Logit
    " enf"           -0.08
    " menuItem"      -0.07
    "Hold"           -0.06
    " signUp"        -0.06
    " vazgeç"        -0.06
    " promotes"      -0.06
    " meat"          -0.06
    " Meat"          -0.06
    "],↵"            -0.06
    "Ingredients"    -0.06
    Positive Logits
    Token            Logit
    "(currency"      0.07
    "operations"     0.07
    (blank)          0.07
    " poco"          0.07
    "inars"          0.07
    " difficulty"    0.06
    "locator"        0.06
    " Voyage"        0.06
    "гляд"           0.06
    "/components"    0.06
    Activation Density 0.006%

    No Known Activations