    Explanations

    This neuron fires on hedging or uncertainty cues: words and phrases that flag experimental, tentative, or “we don’t know” language.

    Negative Logits
    Token               Logit
    " tone"             -0.07
    " oats"             -0.07
    "cap"               -0.07
    "posites"           -0.06
    (no token shown)    -0.06
    " Machine"          -0.06
    "ManyToOne"         -0.06
    " expectancy"       -0.06
    ".string"           -0.06
    "งส"                -0.06

    Positive Logits
    Token               Logit
    "农业"              0.07
    "uncate"            0.06
    "体育"              0.06
    ".GetSize"          0.06
    " coun"             0.06
    "_ARGUMENT"         0.06
    "approximately"     0.06
    "дина"              0.06
    "olucion"           0.06
    "orph"              0.06

    Activation Density: 0.102%

    No Known Activations