INDEX
Explanations
phrases related to clothing items, specifically shorts
references to shorts and related clothing items
Negative Logits
rian                -0.79
Reviewed            -0.75
VERTIS              -0.72
upon                -0.70
————————————————    -0.69
rians               -0.69
Reviewer            -0.67
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯    -0.65
Aud                 -0.64
APH                 -0.64
Positive Logits
shorts              1.57
hirt                1.08
trousers            1.05
pants               0.99
leeve               0.96
jeans               0.95
socks               0.87
underwear           0.87
Shirt               0.86
sleeves             0.82
Activations Density: 0.005%