INDEX
    Explanations

    references to public opinion and democratic processes

    Negative Logits
    Token          Logit
    " dign"        -0.17
    "avel"         -0.14
    "iri"          -0.14
    "лагод"        -0.13
    "733"          -0.13
    " iddi"        -0.13
    "[assembly"    -0.13
    "ussy"         -0.13
    "аток"         -0.13
    " Multiply"    -0.13
    Positive Logits
    Token          Logit
    " support"     0.39
    " popular"     0.38
    " public"      0.37
    " opinion"     0.32
    "popular"      0.31
    " sentiment"   0.30
    " Support"     0.29
    " sentiments"  0.28
    "public"       0.27
    "support"      0.27
    Activation Density: 0.273%

    No Known Activations