    Explanations

    phrases expressing moral outrage or disapproval regarding social issues

    Negative Logits
    Token          Logit
    "ëŀĢ"          -0.14
    " culpa"       -0.13
    "EventArgs"    -0.13
    " blas"        -0.13
    "anes"         -0.13
    " Ens"         -0.13
    " Schwartz"    -0.13
    " Gu"          -0.13
    " Needle"      -0.13
    " Natur"       -0.12
    Positive Logits
    Token          Logit
    "áli"          0.15
    "anova"        0.15
    "rosse"        0.14
    "vero"         0.14
    "اÙħÛĮ"        0.14
    "ushman"       0.13
    "iation"       0.13
    "imo"          0.13
    "wahl"         0.13
    "礼"           0.13
    Activation Density: 0.102%

    No Known Activations