INDEX
Explanations
mentions of human rights abuses and government suppression
NEGATIVE LOGITS
ArgsConstructor   -0.78
الرياضيه   -0.73
pitié             -0.71
Corinth           -0.65
שוליים   -0.65
IUrlHelper        -0.61
ództ              -0.61
conceding         -0.60
Aphrodite         -0.60
atoshi            -0.59
POSITIVE LOGITS
persecution   0.79
persecu       0.75
perse         0.71
persecuted    0.68
arrest        0.67
arrests       0.61
persec        0.59
censorship    0.56
ban           0.55
arbitrary     0.55
Activations Density 0.313%