INDEX
Explanations
mentions of diversity and related concepts
references to diversity in various contexts
Negative Logits
Token       Logit
ENA         -0.90
amina       -0.84
CPC         -0.75
ש           -0.72
RL          -0.70
mentioned   -0.69
ATA         -0.69
hiba        -0.67
CHA         -0.66
nington     -0.65
Positive Logits
Token          Logit
Diversity      1.01
iveness        0.95
diversity      0.86
genders        0.75
ively          0.75
perspectives   0.75
ethnic         0.74
emale          0.73
yip            0.71
itarian        0.70
Activations Density: 0.036%