Explanations
references to diversity
references to diversity in various contexts
Negative Logits
ENA         -0.86
amina       -0.85
ש           -0.73
ving        -0.70
DA          -0.69
ERSON       -0.67
mentioned   -0.67
hiba        -0.67
cise        -0.66
ny          -0.65
Positive Logits
Diversity   0.98
diversity   0.96
iveness     0.87
yip         0.84
ensical     0.76
atility     0.74
ortment     0.73
llor        0.73
ively       0.72
icultural   0.71
Activations Density 0.016%