INDEX
    Explanations

    phrases related to social equality and injustice

    Negative Logits
    Token               Logit
    "ereotype"          -0.15
    "HOOK"              -0.15
    "hook"              -0.14
    "ernen"             -0.14
    " bells"            -0.14
    "hog"               -0.14
    "prt"               -0.14
    "aldi"              -0.13
    "Tracker"           -0.13
    "uggy"              -0.13

    Positive Logits
    Token               Logit
    " nou"              0.17
    "anz"               0.14
    "andest"            0.14
    " Nielsen"          0.14
    "odon"              0.13
    " driving"          0.13
    "istrovství"        0.13
    "inton"             0.13
    " instrumental"     0.13
    " Mell"             0.13
    Act Density 0.019%

    No Known Activations