INDEX
Explanations
proper nouns related to political figures and entities
mentions of specific individuals, particularly women in leadership roles
Negative Logits
Token         Logit
FTWARE        -0.73
amaz          -0.70
ombat         -0.70
Sov           -0.67
ratulations   -0.65
gran          -0.65
uld           -0.64
Interstitial  -0.63
acters        -0.61
ilib          -0.61
Positive Logits
Token         Logit
clips         1.00
Mock          0.79
worthy        0.74
Reno          0.74
ials          0.72
Janet         0.71
eers          0.70
ridge         0.70
Carlson       0.66
bird          0.65
Activations Density 0.035%