INDEX
Explanations
Mentions of women and their experiences in society, particularly themes of empowerment and societal expectations
Head Attr Weights
0:0.16
1:0.15
2:0.09
3:0.08
4:0.04
5:0.03
6:0.09
7:0.09
8:0.02
9:0.05
10:0.11
11:0.04
Negative Logits
Jae      -3.05
helic    -2.95
Fiber    -2.88
ulhu     -2.83
Sz       -2.83
htt      -2.74
akeru    -2.70
Aether   -2.69
Secure   -2.67
Ender    -2.67
Positive Logits
Madonna     8.10
Mad         4.08
McCartney   3.65
Albert      3.54
Joan        3.45
Queen       3.40
Mary        3.39
MAD         3.30
Gaga        3.30
Prince      3.22
Activations Density 0.002%