INDEX
Explanations
Amazon clothing descriptions
NEGATIVE LOGITS
Token      Logit
...        0.61
v          0.58
$          0.58
...,       0.57
(blank)    0.57
Ν          0.56
til        0.56
giant      0.56
N          0.55
нова       0.55
POSITIVE LOGITS
Token          Logit
garments       1.15
sweatshirts    1.06
穿着           1.03
Clothing       1.02
Clothing       0.98
clothing       0.98
sweatshirt     0.96
prendas        0.95
เสื้อ           0.95
garment        0.94
Activations Density: 0.002%