Explanations
references to women in various contexts, particularly in age-related or relational situations
Head Attribution Weights (head: weight)
0:0.08
1:0.01
2:0.15
3:0.06
4:0.12
5:0.03
6:0.08
7:0.02
8:0.15
9:0.05
10:0.10
11:0.10
Negative Logits
Token      Logit
Reloaded   -1.31
eries      -1.22
smoot      -1.20
Ware       -1.18
Rules      -1.18
Hazard     -1.17
「         -1.16
Limit      -1.16
Deposit    -1.15
where      -1.14
Positive Logits
Token      Logit
kinson     1.55
dinand     1.54
alike      1.42
rolet      1.41
gunman     1.41
Kazakh     1.37
ullah      1.36
olulu      1.34
duino      1.33
azeera     1.33
Activations Density 0.020%