INDEX
Explanations
phrases and terms associated with economic inequality and social class distinctions
Head Attr Weights
0:0.02
1:0.01
2:0.10
3:0.06
4:0.28
5:0.03
6:0.11
7:0.16
8:0.03
9:0.04
10:0.06
11:0.05
Negative Logits
][/          -1.55
ogun         -1.45
Journalists  -1.45
irm          -1.44
ipolar       -1.40
cius         -1.38
rus          -1.37
itaire       -1.37
EDIT         -1.36
hement       -1.36
Positive Logits
larg         1.67
notions      1.57
excess       1.55
stereotypes  1.55
metaphors    1.54
concepts     1.52
tricks       1.50
wholes       1.50
scraps       1.49
principles   1.48
Activations Density 0.003%