INDEX
    Explanations

    The neuron fires on words and phrases used to judge factual consistency—e.g. “fact,” “factually,” “consistent,” and “inconsistent.”

    Phrases that indicate requesting or asking someone to do something.

    Negative Logits
    Token         Logit
    ".LA"         -0.07
    "_y"          -0.07
    "ρία"         -0.07
    " pog"        -0.06
    "eryl"        -0.06
    "网址"        -0.06
    " slam"       -0.06
    " Commons"    -0.06
    " Penalty"    -0.06
    " goog"       -0.06
    Positive Logits
    Token         Logit
    " marched"    0.06
    "{lng"        0.06
    " nominated"  0.06
    "ньої"        0.06
    "าหาร"        0.06
    " mạch"       0.06
    " castle"     0.06
    " wildfire"   0.05
    ".scalajs"    0.05
    ".series"     0.05
    Activation Density: 0.020%

    No Known Activations