    Explanations

    References to fashion and style-related concepts.

    The neuron fires on the word “fashion” (and its morphological variants like “fashionable” or “fashionably”) in headlines and text.
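
    As a rough illustration of the explanation above, the sketch below checks whether a piece of text contains "fashion" or one of a few morphological variants. The helper name `mentions_fashion` and the suffix list are assumptions made for this example; it is only a crude textual proxy and not the neuron's actual activation function.

```python
import re

# Hypothetical pattern: "fashion" plus a few common suffixes.
# This is an assumed approximation of the explanation, not the neuron itself.
FASHION_PATTERN = re.compile(r"\bfashion(s|ed|able|ably)?\b", re.IGNORECASE)


def mentions_fashion(text: str) -> bool:
    """Return True if the text contains "fashion" or one of the listed variants."""
    return FASHION_PATTERN.search(text) is not None


examples = [
    "Paris Fashion Week opens today",
    "She arrived fashionably late",
    "An old-fashioned remedy",
    "The enzyme binds to the substrate",
]

for text in examples:
    print(mentions_fashion(text), "|", text)
```

    A text check like this can only suggest where the explanation predicts activity; confirming it would require the neuron's recorded activations, which this page does not list.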

    Negative Logits
    Token         Logit
    "ريب"         -0.07
    "لی"          -0.07
    "2"           -0.07
    " zombie"     -0.07
    " enzyme"     -0.07
    " зат"        -0.06
    " cruc"       -0.06
    " cube"       -0.06
    "。↵"         -0.06
    "олод"        -0.06
    Positive Logits
    Token             Logit
    " fashion"        0.13
    " Fashion"        0.10
    " fashionable"    0.10
    "-fashion"        0.09
    "ashion"          0.08
    " fashioned"      0.08
    " admir"          0.08
    "Fashion"         0.08
    "afka"            0.07
    " gossip"         0.07
    Activation Density: 0.005%

    No Known Activations