INDEX
    Explanations

    references to reform, particularly in economic and policy contexts

    Negative Logits
    Token         Logit
    "ially"       -0.18
    "kest"        -0.18
    "emas"        -0.16
    "hood"        -0.16
    "oub"         -0.15
    "IAL"         -0.15
    " fines"      -0.14
    " formal"     -0.14
    " formally"   -0.14
    "chy"         -0.14
    Positive Logits
    Token         Logit
    "atted"       0.28
    "ative"       0.27
    "ulated"      0.23
    "ulate"       0.21
    "ulation"     0.20
    "atories"     0.20
    "ers"         0.19
    "ulating"     0.19
    "idable"      0.19
    "/update"     0.17
    Act Density 0.014%

    No Known Activations