INDEX
    Explanations

    words related to social justice and community support initiatives

    Negative Logits
    Token       Logit
    sworth      -0.18
    (blank)     -0.18
    li          -0.18
    ning        -0.17
    lo          -0.17
    ìĿĦ         -0.16
    ry          -0.16
    ra          -0.16
    liness      -0.16
    Ìĥ          -0.16
    Positive Logits
    Token       Logit
    ez          0.16
    ύ           0.15
    AGE         0.15
    -minded     0.15
    -looking    0.14
    ALLY        0.14
    ehler       0.14
    y           0.14
    iative      0.13
    element     0.13
    Activation Density: 0.151%

    No Known Activations