Explanations
mentions of specific groups of people or organizations
references to social justice and advocacy for marginalized groups
Negative Logits
interstitial    -0.60
occasional      -0.57
guitarist       -0.57
quizz           -0.56
notable         -0.55
frequent        -0.53
booted          -0.52
arthed          -0.52
dubious         -0.52
antibodies      -0.51
Positive Logits
..."    1.59
,''     1.36
…"      1.34
[/      1.33
[/      1.33
*/      1.31
.,"     1.29
!!"     1.29
,"      1.26
..."    1.24
Activations Density 1.078%