INDEX
Explanations
phrases describing or questioning social and historical issues, particularly related to identity, race, and societal norms
references to historical and cultural contexts related to race and identity
Negative Logits
DCS          -0.63
Oversight    -0.61
climbers     -0.60
backers      -0.56
Kickstarter  -0.56
PAX          -0.56
subreddits   -0.55
ACTIONS      -0.55
Hurricanes   -0.55
LCS          -0.54
Positive Logits
whereas  0.76
_.       0.74
;}       0.72
\.       0.72
.''      0.71
['       0.70
</       0.69
ocre     0.68
^{       0.68
âĪĴ      0.67
Activations Density 1.074%