INDEX
Explanations
phrases or terms related to policies, reports, and social issues
indicators of social issues affecting marginalized groups
Negative Logits
Catalyst      -0.73
Babel         -0.73
warp          -0.70
Zeit          -0.68
Tulsa         -0.68
Strawberry    -0.68
Rez           -0.67
CLR           -0.67
Rhodes        -0.66
antic         -0.65
Positive Logits
who           1.52
selves        1.28
whose         1.09
their         1.02
owners        1.00
living        0.97
who           0.94
whom          0.94
elected       0.93
ï¸ı           0.92
Activations Density 0.296%