INDEX
Explanations
references to the ages and demographics of individuals, particularly minors and young people
Negative Logits
Token        Logit
underage     -0.21
teenage      -0.20
older        -0.20
Older        -0.19
younger      -0.19
elderly      -0.19
older        -0.18
Ages         -0.17
ageing       -0.17
äl           -0.17
Positive Logits
Token        Logit
who          0.21
-fashioned   0.18
/new         0.17
whose        0.17
-old         0.16
who          0.16
fashioned    0.15
emy          0.15
Rosenberg    0.15
virgin       0.14
Activations Density: 0.030%