INDEX
    Explanations

    references to societal issues and movements, particularly those related to justice and equality

    Negative Logits
    ança
    -0.15
    impl
    -0.15
     ged
    -0.14
    oris
    -0.14
    OS
    -0.14
     Si
    -0.14
     vere
    -0.14
     sh
    -0.14
     Alternate
    -0.14
     AN
    -0.14
    Positive Logits
    berman
    0.19
     nebu
    0.16
    izzo
    0.16
     etc
    0.16
    etc
    0.15
    oÄŁ
    0.15
     MÄĽst
    0.15
    ij¸
    0.15
    اتÙĩ
    0.15
    azen
    0.14
    Act Density 0.674%

    No Known Activations