INDEX
Explanations
statements about women's social behavior and public perception
Head Attr Weights
0:0.07
1:0.03
2:0.06
3:0.04
4:0.06
5:0.04
6:0.21
7:0.05
8:0.05
9:0.28
10:0.02
11:0.03
Negative Logits
ェ          -4.07
Fla        -3.55
shroud     -3.44
Claus      -3.44
RG         -3.43
MpServer   -3.42
Fal        -3.35
leagues    -3.35
Toro       -3.30
Boyd       -3.27
Positive Logits
Amy        9.32
Amy        9.14
amy        7.33
amy        5.59
ALS        5.09
AMY        4.68
Amtrak     4.28
Tammy      4.20
Amelia     4.00
iam        3.98
Activations Density 0.002%