INDEX
    Explanations

    This neuron activates on terms that denote an overall or aggregate measure, such as “whole,” “overall,” “system,” “performance,” or “volume.”

    Negative Logits
     ấm
    -0.08
     менее
    -0.07
     tak
    -0.07
     Jackson
    -0.07
    aver
    -0.06
    Ingredient
    -0.06
    attention
    -0.06
    aceutical
    -0.06
    -0.06
    -0.06
    Positive Logits
    =p
    0.07
    :v
    0.06
    (resources
    0.06
    (V
    0.06
    +t
    0.06
    idea
    0.06
    ,s
    0.06
    .Write
    0.06
     ORDER
    0.06
    .segments
    0.06
    Activation Density: 0.025%

    No Known Activations