INDEX
    Explanations

    phrases related to clothing items and styles

    references to specific clothing and fashion items

    Negative Logits
    " downstream"    -0.77
    "onential"       -0.75
    "rencies"        -0.74
    "terness"        -0.72
    " Nuclear"       -0.72
    "ithmetic"       -0.70
    "ETHOD"          -0.70
    "Torrent"        -0.70
    " Kumar"         -0.69
    "uclear"         -0.68

    Positive Logits
    " worn"           1.52
    " wardrobe"       1.26
    " scarf"          1.26
    " waist"          1.25
    " adorned"        1.23
    " trousers"       1.22
    " sleeves"        1.21
    " hairst"         1.21
    " underwear"      1.18
    " wore"           1.18
    Activation Density: 0.555%

    No Known Activations