INDEX
    Explanations

    phrases related to challenging or reviewing actions by authority figures

    instances of the word "by" indicating actions or authorship

    Negative Logits
     resil            -0.79
    ãĤ¦ãĤ¹            -0.78
     "$:/             -0.75
    redits            -0.70
    stakes            -0.68
    tenance           -0.67
    adal              -0.66
    meat              -0.65
    URN               -0.64
    qqa               -0.64
    Positive Logits
     virtue           1.03
    laws              0.96
    products          0.95
    gone              0.85
    product           0.78
     omission         0.77
     proxy            0.72
     policymakers     0.69
     whistleblowers   0.69
     politicians      0.69
    Activation Density: 0.122%

    No Known Activations