INDEX
Explanations
mentions of political or governmental policies
references to government policies
Negative Logits
issan         -0.84
ITNESS        -0.82
lighting      -0.76
Ort           -0.74
Sabha         -0.73
sembly        -0.73
Rocket        -0.73
CLASSIFIED    -0.72
athan         -0.71
Sud           -0.70
Positive Logits
policies       1.32
policy         1.06
Policies       1.05
prescriptions  0.99
stances        0.90
policy         0.84
stance         0.83
preferences    0.82
olicy          0.81
directives     0.81
Activations Density 0.014%