INDEX
    Explanations

    sentences discussing societal challenges and responsibilities

    Negative Logits
    istrovství
    -0.17
    isex
    -0.16
    ereotype
    -0.16
    unner
    -0.16
    ーム
    -0.15
     buflen
    -0.15
     Ryder
    -0.15
    unal
    -0.15
    ervoir
    -0.15
    ングル
    -0.15
    Positive Logits
    >({
    0.15
    fare
    0.15
     Gar
    0.14
    .ms
    0.14
     recipro
    0.14
     Garrett
    0.14
    .global
    0.14
     Dent
    0.14
    opp
    0.13
    mun
    0.13
    Act Density 1.094%

    No Known Activations