INDEX
Explanations
pronouns, particularly those referring to individuals
Head Attr Weights
0:0.03
1:0.05
2:0.06
3:0.05
4:0.13
5:0.04
6:0.06
7:0.10
8:0.18
9:0.04
10:0.10
11:0.12
Negative Logits
Chance:-1.84
ガ:-1.63
Assignment:-1.54
intendo:-1.53
Males:-1.53
Household:-1.51
Mate:-1.51
Cause:-1.47
resents:-1.46
Cameroon:-1.45
Positive Logits
ived:1.98
odynam:1.76
rified:1.69
price:1.66
ucl:1.61
iblical:1.56
erest:1.51
priced:1.50
leap:1.47
tions:1.47
Activations Density:0.002%