INDEX
Explanations
references to involvement in political, social, or community activities
Head Attr Weights
0:0.02
1:0.00
2:0.10
3:0.06
4:0.11
5:0.03
6:0.05
7:0.35
8:0.02
9:0.04
10:0.10
11:0.06
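The listing above gives one attribution weight per attention head (indices 0 to 11). As a minimal sketch, the snippet below ranks the heads and reports the share held by the largest one. The variable and function names (`head_attr_weights`, `rank_heads`) are hypothetical and introduced only for illustration. The source does not say whether the weights are normalized, so the total is computed rather than assumed to be 1.

```python
# Head attribution weights copied verbatim from the listing (head index -> weight).
# The dictionary name is an illustrative assumption, not part of the source.
head_attr_weights = {
    0: 0.02, 1: 0.00, 2: 0.10, 3: 0.06, 4: 0.11, 5: 0.03,
    6: 0.05, 7: 0.35, 8: 0.02, 9: 0.04, 10: 0.10, 11: 0.06,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted from largest to smallest weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_attr_weights)
top_head, top_weight = ranked[0]

# The listed weights are rounded to two decimals; sum them instead of assuming 1.0.
total = sum(head_attr_weights.values())
top_share = top_weight / total

print(f"dominant head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
print(f"dominant head share of listed total: {top_share:.1%}")
```

On these numbers, head 7 carries the largest weight (0.35), a little over a third of the listed total, with heads 4, 2, and 10 following at 0.10 to 0.11 each.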
Negative Logits
iture      -1.86
items      -1.77
marked     -1.63
ishable    -1.63
pection    -1.60
pict       -1.59
iencies    -1.58
pex        -1.52
evidence   -1.51
phe        -1.50
Positive Logits
bandwagon          2.21
enthusiastically   1.78
fray               1.76
corpor             1.75
chorus             1.68
ferv               1.59
passionately       1.58
coh                1.55
participate        1.54
psy                1.51
Activations Density 0.003%