Explanations
concepts related to social inequality and class disparities
Negative Logits
Token        Logit
OKIE         -0.15
vala         -0.14
Enumerator   -0.14
ardi         -0.14
uela         -0.14
_CSV         -0.14
ogui         -0.14
obo          -0.14
HEME         -0.14
eba          -0.13
Positive Logits
Token        Logit
handful      0.36
few          0.32
select       0.29
few          0.27
Few          0.25
minority     0.25
Few          0.24
elite        0.23
elite        0.22
fewer        0.22
Activations Density: 0.169%