    Explanations

    This neuron is primarily triggered by the word “changes.”

    Negative Logits
    Token            Logit
    "fort"           -0.08
    " Petroleum"     -0.08
    " ppl"           -0.07
    " prefab"        -0.07
    " Elliot"        -0.07
    " Elliott"       -0.07
    "(ViewGroup"     -0.06
    " Txt"           -0.06
    " Roosevelt"     -0.06
    " Eagles"        -0.06
    Positive Logits
    Token            Logit
    "Changes"        0.09
    " changes"       0.09
    " Changes"       0.08
    " change"        0.08
    "as"             0.07
    "стро"           0.07
    "imize"          0.07
    "-care"          0.07
    "MS"             0.07
    "行为"           0.07
    Activation Density: 0.028%

    No Known Activations