INDEX
    Explanations

    keywords related to legal terms and political figures

    phrases related to combating gender bias and promoting equality

    Negative Logits
     vulner         -0.60
     rigging        -0.60
     ..."           -0.59
     respectively   -0.59
     prec           -0.56
     fame           -0.55
     opposite       -0.53
     polar          -0.51
     persuasion     -0.49
     decisive       -0.49
    Positive Logits
    ashtra          0.70
    arius           0.68
    arij            0.66
    yna             0.65
    yn              0.64
     Profile        0.64
    abus            0.63
    ulous           0.62
    azel            0.62
    ibliography     0.61
    Act Density 1.642%

    No Known Activations