Explanations
References to social class distinctions, with a particular focus on the middle class
Negative Logits
radi      -0.17
sun       -0.17
Ïĩε       -0.16
epad      -0.16
eny       -0.15
tsky      -0.15
survey    -0.15
yy        -0.15
antan     -0.15
SIDE      -0.15
Positive Logits
-aged     0.30
sex       0.28
weight    0.26
bury      0.25
aged      0.25
Ages      0.23
ton       0.23
aged      0.23
eastern   0.23
SEX       0.22
Activations Density 0.017%