    Explanations

    mentions of T-shirts with specific characteristics or messages

    Negative Logits
    Token           Logit
    " minimized"    -0.71
    "theless"       -0.69
    " pim"          -0.66
    " prof"         -0.65
    " degraded"     -0.65
    " wiret"        -0.63
    " rhy"          -0.61
    " quar"         -0.60
    " halluc"       -0.60
    " sanct"        -0.60
    Positive Logits
    Token           Logit
    "shirt"         1.05
    "shirts"        1.01
    "rex"           0.92
    "minus"         0.90
    "iron"          0.90
    "agonist"       0.90
    "adic"          0.86
    "level"         0.85
    "squ"           0.82
    "eye"           0.81
    Activation Density: 0.052%

    No Known Activations