INDEX
    Explanations

    This neuron detects references to diversity and inclusion.

    Negative Logits
    Token            Logit
    "KEY"            -0.07
    "aksi"           -0.06
    " Steps"         -0.06
    "Ya"             -0.06
    "subjects"       -0.06
    " yak"           -0.06
    "-digit"         -0.06
    ".get"           -0.06
    " faces"         -0.06
    " Euler"         -0.06
    Positive Logits
    Token            Logit
    " diversity"     0.09
    " Diversity"     0.08
    " tvoř"          0.07
    " Compet"        0.07
    " represented"   0.06
    "represented"    0.06
    " meilleure"     0.06
    " Interpreter"   0.06
    "erialized"      0.06
    " Аф"            0.06
    Act Density 0.008%

    No Known Activations