    Explanations

    Action figures

    The neuron fires on mentions of collectible action‐figure products (e.g. “figure,” “figures,” “action figure,” “vintage collection”).

    Negative Logits
    (no token shown)  -0.06
     hj  -0.06
     عاشق  -0.06
    /design  -0.06
    _relation  -0.06
    Das  -0.06
     Smoking  -0.06
    (no token shown)  -0.06
     Raised  -0.05
     dubbed  -0.05
    Positive Logits
     ROUND  0.07
    (no token shown)  0.07
    iddet  0.07
    ayet  0.07
    ζει  0.06
    empt  0.06
    ;");↵  0.06
    reachable  0.06
    ニニ  0.06
    rich  0.06
    Act Density 0.014%

    No Known Activations