INDEX
Explanations
references to clothing items, particularly shirts
references to shirts and shirt-related items
Negative Logits
sys          -0.71
ths          -0.70
SPONSORED    -0.67
ruciating    -0.65
ingred       -0.63
Galile       -0.62
fert         -0.62
ydia         -0.61
ITNESS       -0.61
OTOS         -0.60
Positive Logits
leeve        1.20
hirt         1.20
shirts       1.15
sleeve       1.13
sleeves      1.13
shirt        1.09
shirt        1.07
worn         1.02
shirts       1.01
collar       0.97
Activations Density 0.044%
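
The density figure above can be read as the share of sampled tokens on which the feature fires. The sketch below shows one way such a percentage might be computed. It assumes density means "percentage of tokens with activation above a threshold", which this page does not state, and the function name `activation_density` and the `threshold` parameter are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the fraction of sampled tokens on which the
    feature is active, expressed as a percentage. The dashboard's exact
    definition is not given in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    # Count the tokens on which the feature fires at all.
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy data: the feature fires on 1 token out of 4.
print(activation_density([0.0, 0.0, 1.5, 0.0]))
```

Under this reading, a density of 0.044% would correspond to the feature firing on roughly 44 of every 100,000 sampled tokens.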