INDEX
Explanations
references to statistical data and studies related to social issues
Negative Logits
æĸ¼         -0.15
Wheeler     -0.14
ä¸ĺ         -0.14
emain       -0.14
anto        -0.14
arris       -0.14
iders       -0.14
(blank)     -0.14
abo         -0.13
Gould       -0.13
Positive Logits
ży          0.15
ãĥªãĥ¼ãĤº   0.15
otto        0.14
olt         0.14
iban        0.14
/or         0.14
tes         0.14
ì͍          0.14
fo          0.14
ffee        0.13
Activations Density: 0.026%