Explanations
References to fashion and style-related concepts.
The neuron fires on the word “fashion” (and its morphological variants like “fashionable” or “fashionably”) in headlines and text.
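As a rough illustration of the strings this explanation describes, the sketch below finds "fashion" and its suffixed variants in a piece of text. It is only a text-level proxy, not the neuron itself: the pattern, the name `FASHION_PATTERN`, and the helper `find_fashion_terms` are hypothetical, and the case-insensitive, word-boundary matching is an assumption rather than something stated on this page.

```python
import re
from typing import List

# Assumption: a case-insensitive match on the stem "fashion" followed by any
# word characters approximates "fashion and its morphological variants".
# Hypothetical helper, not part of any dashboard or library API.
FASHION_PATTERN = re.compile(r"\bfashion\w*", re.IGNORECASE)


def find_fashion_terms(text: str) -> List[str]:
    """Return every substring that starts at a word boundary with 'fashion'."""
    return FASHION_PATTERN.findall(text)


if __name__ == "__main__":
    headline = "Paris Fashion Week: arriving fashionably late"
    print(find_fashion_terms(headline))
```

A proxy like this can help when checking top-activating examples by hand, but the neuron may also respond to contexts that a regular expression cannot capture.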
Negative Logits
Token    Value
ريب    -0.07
لی    -0.07
2    -0.07
zombie    -0.07
enzyme    -0.07
зат    -0.06
cruc    -0.06
cube    -0.06
。 ↵    -0.06
олод    -0.06
Positive Logits
Token    Value
fashion    0.13
Fashion    0.10
fashionable    0.10
-fashion    0.09
ashion    0.08
fashioned    0.08
admir    0.08
Fashion    0.08
afka    0.07
gossip    0.07
Activations Density: 0.005%
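To give a sense of scale for the density figure, the sketch below converts it into an expected count of activating tokens. It assumes that density means the fraction of tokens on which the neuron has a nonzero activation, which this page does not define. The function name `expected_active_tokens` is hypothetical.

```python
def expected_active_tokens(num_tokens: int, density_percent: float = 0.005) -> float:
    """Expected number of tokens with nonzero activation.

    Assumption: density is the percentage of tokens on which the neuron
    fires at all, so the expected count is num_tokens * density / 100.
    """
    return num_tokens * density_percent / 100.0


if __name__ == "__main__":
    # Under the assumption above, one million tokens yield roughly 50 activations.
    print(expected_active_tokens(1_000_000))
```

Under that reading, the neuron is very sparse: it fires on about 1 token in 20,000.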