INDEX
Explanations
references to hats and headwear
Negative Logits
clothes     -0.16
Hou         -0.15
endon       -0.15
Clothes     -0.15
dresses     -0.15
Clothing    -0.14
_skin       -0.14
Hund        -0.14
skin        -0.14
hyp         -0.14
Positive Logits
hat         0.76
hats        0.69
Hat         0.67
hat         0.62
Hat         0.60
帽          0.59
Hats        0.55
_hat        0.49
cap         0.49
caps        0.45
Activations Density 0.138%