INDEX
Explanations
phrases that emphasize social justice and advocacy
Negative Logits
ollapse      -0.16
VERS         -0.15
Rank         -0.14
.annotate    -0.14
nee          -0.14
ë¨           -0.14
ollapsed     -0.14
åı£          -0.14
otel         -0.14
hape         -0.14
Positive Logits
builders     0.15
æļ           0.15
arr          0.14
undy         0.14
Tru          0.14
fg           0.14
ili          0.14
894          0.14
art          0.14
licit        0.13
Activations Density 0.046%