INDEX
    Explanations

    nationalities or demographics of different groups of people

    mentions of organizations or groups, particularly in a formal context

    Negative Logits
    Token                   Logit
    " restoration"          -0.73
    " reversible"           -0.71
    "fixes"                 -0.70
    "onement"               -0.68
    " irreversible"         -0.68
    "iversary"              -0.68
    " Cancel"               -0.67
    "ettel"                 -0.66
    " postp"                -0.66
    " restoring"            -0.65

    Positive Logits
    Token                   Logit
    "average"               0.88
    " averages"             0.81
    " populous"             0.80
    " average"              0.79
    " diversity"            0.77
    "Average"               0.76
    " dwar"                 0.76
    " diverse"              0.76
    " disproportionately"   0.76
    " unaff"                0.75
    Activation Density: 0.927%

    No Known Activations