INDEX
    Explanations

    phrases related to empowerment

    terms related to empowerment and support for marginalized groups

    Negative Logits
    Token         Logit
    " Goo"        -0.73
    "patch"       -0.68
    " Canaver"    -0.67
    "ago"         -0.67
    "×IJ"         -0.66
    "eda"         -0.66
    " Uniform"    -0.65
    "owitz"       -0.63
    "hound"       -0.63
    "hiba"        -0.63
    Positive Logits
    Token         Logit
    "ments"       1.01
    "Reviewer"    0.90
    "ment"        0.85
    "MENTS"       0.77
    "iences"      0.77
    "mentation"   0.75
    "irlf"        0.73
    "ittees"      0.72
    " empower"    0.71
    "FUL"         0.71
    Activation Density: 0.038%

    No Known Activations