Explanations
themes related to societal expectations and the pressure to appear perfect or successful
Negative Logits
orus    -0.16
lein    -0.15
artin   -0.14
arel    -0.14
warts   -0.14
κέ      -0.14
POCH    -0.14
Ark     -0.14
rud     -0.14
kara    -0.14
Positive Logits
Pressure  0.15
tả        0.15
HITE      0.14
pressure  0.14
pressure  0.14
.camel    0.14
перек     0.14
Pressure  0.13
SKI       0.13
ç¡        0.13
Activations Density 0.128%