INDEX
Explanations
phrases related to dressing up or wearing costumes
Head Attr Weights
0:0.01
1:0.01
2:0.05
3:0.06
4:0.10
5:0.02
6:0.04
7:0.45
8:0.02
9:0.03
10:0.09
11:0.06
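
The weights above can be read as a ranking of attention heads by attribution. A minimal sketch, assuming the listing is a head index to weight mapping (the names `head_attr`, `ranked`, `top_head`, and `share` are illustrative and not part of the dashboard):

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_attr = {
    0: 0.01, 1: 0.01, 2: 0.05, 3: 0.06, 4: 0.10, 5: 0.02,
    6: 0.04, 7: 0.45, 8: 0.02, 9: 0.03, 10: 0.09, 11: 0.06,
}

# Rank heads from largest to smallest attribution weight.
ranked = sorted(head_attr.items(), key=lambda kv: kv[1], reverse=True)
top_head, top_weight = ranked[0]

# The listed weights are rounded, so they need not sum to exactly 1;
# normalise by the listed total rather than assuming it.
total = sum(head_attr.values())
share = top_weight / total

print(f"top head: {top_head}, weight {top_weight:.2f}, share of listed total {share:.2f}")
```

On these numbers head 7 carries the largest weight by a wide margin, with head 4 second.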
Negative Logits
owners        -1.64
Effects       -1.62
holding       -1.59
affected      -1.58
otten         -1.57
unanswered    -1.53
dissatisf     -1.52
occupied      -1.50
marks         -1.49
asta          -1.47
Positive Logits
Bride         1.73
Jacket        1.67
Shirt         1.63
robes         1.60
Masquerade    1.57
underwear     1.53
costume       1.51
clothing      1.48
attire        1.48
dresses       1.48
Activations Density 0.024%