INDEX
Explanations
discriminatory language related to legal actions
references to investments and financial opportunities related to community initiatives
Negative Logits
pse             -0.86
sucker          -0.83
spitting        -0.81
grop            -0.78
poisoning       -0.77
imperson        -0.77
gamb            -0.75
civilisation    -0.72
mathemat        -0.72
desper          -0.71
Positive Logits
###             1.52
SOURCE          1.47
Learn           1.45
About           1.44
Featured        1.43
Visit           1.40
Below           1.38
Explore         1.37
advertisement   1.32
Join            1.30
Activations Density 0.486%