Explanations
This neuron activates on mentions of “ceramic” (and related forms like “ceramics”) in the text.
Negative Logits
Subjects    -0.08
,alpha      -0.07
InOut       -0.06
Anth        -0.06
Save        -0.06
southwest   -0.06
sunshine    -0.06
_:*         -0.06
منتشر       -0.06
toolbar     -0.06
Positive Logits
ceramic     0.11
Ceramic     0.11
ceramics    0.09
edian       0.07
porcelain   0.07
yat         0.07
cat         0.07
hayal       0.07
amik        0.07
cosmetic    0.07
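The two logit lists can be read as the extremes of one ranking: the tokens whose output logits this neuron raises most and lowers most. Below is a minimal sketch of that ranking, using a handful of the values listed above. The dictionary layout and the helper name `top_and_bottom` are assumptions made for illustration, not the dashboard's actual implementation.

```python
# Subset of the logit effects listed above (token -> value).
logit_effects = {
    "ceramic": 0.11,
    "Ceramic": 0.11,
    "ceramics": 0.09,
    "porcelain": 0.07,
    "Subjects": -0.08,
    ",alpha": -0.07,
    "toolbar": -0.06,
}


def top_and_bottom(effects, k=3):
    """Return the k most positive and the k most negative (token, value) pairs.

    Hypothetical helper: the positive list is ordered highest first,
    the negative list is ordered lowest first.
    """
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k], ranked[-k:][::-1]


positive, negative = top_and_bottom(logit_effects)
print("positive:", positive)
print("negative:", negative)
```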
Activations Density: 0.003%
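One common reading of activation density is the fraction of token positions on which the neuron fires at all. The sketch below computes it under that assumption. The function name `activation_density`, the firing threshold of zero, and the sample activations are all hypothetical; only the resulting 0.003% figure comes from the page.

```python
def activation_density(activations):
    """Fraction of token positions with a strictly positive activation.

    Assumes "fires" means activation > 0; the real dashboard may use
    a different threshold.
    """
    if not activations:
        return 0.0
    return sum(1 for a in activations if a > 0) / len(activations)


# Hypothetical sample: 3 firing positions out of 100,000 tokens.
acts = [0.0] * 99_997 + [1.2, 0.4, 2.5]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.003%
```

A density this low is consistent with a neuron that responds to one narrow family of tokens, such as "ceramic" and its variants.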