INDEX
    Explanations

    Introductions/Overviews

    The neuron fires on hedging or qualifying language: words and phrases that express uncertainty, difficulty, or broad generalization (e.g. “heavily,” “variety of factors,” “difficult to identify”).

    Negative Logits
    Token             Logit
    "nr"              -0.07
    " reactionary"    -0.07
    " tuple"          -0.06
    "ωσε"             -0.06
    " преж"           -0.06
    " KB"             -0.06
    " Tate"           -0.06
    "_steps"          -0.06
    " beer"           -0.06
    " Ward"           -0.06

    Positive Logits
    Token             Logit
    "оком"            0.07
    "_boolean"        0.06
    " unthinkable"    0.06
    " precisely"      0.06
    "ляют"            0.06
    " Disaster"       0.06
    " physicists"     0.06
    ".clientY"        0.06
    " материалов"     0.06
    " своє"           0.06

    Activation Density: 0.199%

    No Known Activations