INDEX
    Explanations

    AI ethics and bias

    Negative Logits
    Token            Logit
    " jot"           -0.08
    " aromatic"      -0.08
    " singly"        -0.08
                     -0.08
    " Albums"        -0.08
    " cactus"        -0.08
    " Resort"        -0.08
    " '.'"           -0.08
    " Hotel"         -0.08
    " percussion"    -0.07
    Positive Logits
    Token            Logit
    " biases"        0.20
    "Bias"           0.18
    "_bias"          0.18
    " fairness"      0.18
    " bias"          0.17
    " biased"        0.17
    "公平"           0.16
    " Bias"          0.16
    "bias"           0.16
    " biais"         0.16
    Activation Density: 0.025%

    No Known Activations