Explanations
references to young individuals or youth-related topics
Negative Logits
Token            Logit
young            -0.79
young            -0.74
younger          -0.68
jeune            -0.67
Young            -0.66
giovane          -0.65
Young            -0.65
autorytatywna    -0.63
giovani          -0.60
YOUNG            -0.59
Positive Logits
Token            Logit
blood            0.81
sters            0.77
stown            0.66
ish              0.64
ster             0.64
lings            0.63
STERS            0.61
bucks            0.60
adulthood        0.60
minds            0.59
Activations Density 0.122%