INDEX
Explanations
clothing items, specifically T-shirts
references to T-shirts
Negative Logits
Token      Logit
Reviewer   -0.77
audi       -0.71
«ĺ         -0.70
ntil       -0.69
judicial   -0.66
presided   -0.63
Aud        -0.62
ļ          -0.62
distant    -0.62
moving     -0.61
Positive Logits
Token      Logit
shirt      1.42
shirts     1.28
hirt       1.09
shirts     1.06
shirt      0.97
leeve      0.89
Shirt      0.88
idas       0.88
cloth      0.86
sleeves    0.86
Activations Density: 0.008%