Explanations
occurrences of pronouns and inclusive language
Negative Logits
Token            Logit
kre              -0.16
alfa             -0.16
Intelligence     -0.16
FLOW             -0.15
unc              -0.15
ague             -0.14
(no token text)  -0.14
Equ              -0.14
Progress         -0.14
g                -0.14
Positive Logits
Token            Logit
دف               0.17
řeh              0.17
oldem            0.15
یط               0.15
Ну               0.15
miner            0.15
watt             0.14
<path            0.14
ンテ               0.14
Dispatch         0.14
Activations Density 0.001%