    Explanations

    gender roles and inequality

    Negative Logits
    Interface    0.38
     tár    0.38
    過程    0.38
    量は    0.38
    ",[],"    0.37
     மற    0.37
     فرد    0.37
    வீன    0.37
     hoeveel    0.36
     reimb    0.36
    Positive Logits
     specificity    0.94
     affiliation    0.88
    specificity    0.87
     differences    0.86
    pecific    0.76
     bias    0.72
     preference    0.71
    Specificity    0.68
     specific    0.67
     affiliations    0.67
    Act Density 0.091%

    No Known Activations