INDEX
Explanations
terms related to unethical or illegal activities, specifically corruption
instances and mentions of corruption
Negative Logits
imus      -0.89
nee       -0.83
¯¯¯¯      -0.76
ofi       -0.75
AY        -0.73
amins     -0.72
lee       -0.71
estead    -0.70
emade     -0.70
ocard     -0.69
Positive Logits
corruption      1.11
scandals        1.02
corrupt         0.89
Corruption      0.88
corrupted       0.86
scandal         0.86
bribery         0.81
disproportion   0.81
penalty         0.79
dealings        0.78
Activations Density: 0.014%
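To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of token positions where the feature's activation is above zero. The page does not state its exact definition, so that reading, the function name `activation_density`, and the toy data are all assumptions made for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature fires;
    the source page does not define the term.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 100,000 token positions, with the feature
# firing on 14 of them (values chosen only to illustrate the arithmetic).
toy_activations = [0.0] * 100_000
for position in range(14):
    toy_activations[position * 7_000] = 1.5

density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.014% corresponds to the feature firing on roughly 14 of every 100,000 tokens, which fits a feature tied to a narrow topic such as corruption.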