INDEX
    Explanations

    references to diversity in various contexts

    mentions of diversity or varied groups of people/things

    Negative Logits
    Token             Logit
    "çͰ"              -0.74
    "rol"             -0.74
    "WARD"            -0.73
    "FORE"            -0.73
    "ENA"             -0.71
    "WAR"             -0.68
    "Ob"              -0.66
    " clicked"        -0.65
    "Removed"         -0.63
    "rollers"         -0.63
    Positive Logits
    Token             Logit
    "ively"           0.99
    "ortment"         0.92
    "mble"            0.89
    "iveness"         0.89
    " assemb"         0.88
    "iated"           0.86
    " perspectives"   0.86
    " genders"        0.85
    " avenues"        0.84
    " viewpoints"     0.83
    Activation Density: 0.027%

    No Known Activations