INDEX
    Explanations

    references to diversity and its various dimensions, including cultural, ideological, and biological aspects

    Head Attr Weights
    0:0.01
    1:0.01
    2:0.09
    3:0.05
    4:0.08
    5:0.02
    6:0.06
    7:0.40
    8:0.02
    9:0.02
    10:0.10
    11:0.10
    Negative Logits
    Token          Logit
    "urat"         -1.47
    "pir"          -1.45
    "aunder"       -1.40
    "money"        -1.38
    "¢"            -1.33
    "monitor"      -1.33
    "WARD"         -1.31
    "raz"          -1.31
    [not shown]    -1.31
    "grand"        -1.30
    Positive Logits
    Token               Logit
    " sexes"            1.83
    " genders"          1.82
    " demographics"     1.66
    " opinion"          1.63
    " opinions"         1.54
    " geographically"   1.51
    " Diversity"        1.51
    " individuality"    1.51
    "erning"            1.46
    " personality"      1.43
    Act Density 0.009%

    No Known Activations