INDEX
Explanations
references to inclusivity and universality
Negative Logits
kle      -0.16
Ì        -0.16
relude   -0.15
eral     -0.14
elia     -0.14
ysi      -0.14
lz       -0.13
uard     -0.13
609      -0.13
lü       -0.13
Positive Logits
sorts        0.17
walk         0.17
/Dk          0.16
sexes        0.16
ninger       0.16
seasons      0.16
genders      0.15
286          0.15
createForm   0.15
quanh        0.15
Activations Density 0.128%