INDEX
    Explanations

    themes related to social justice and advocacy

    Negative Logits
    "rak"         -0.16
    "_SAN"        -0.15
    "rees"        -0.15
    "AndServe"    -0.15
    "ogn"         -0.15
    "ابی"         -0.14
    "orn"         -0.14
    " komp"       -0.14
    "asma"        -0.14
    "inne"        -0.14
    Positive Logits
    " with"       0.34
    "with"        0.26
    "\twith"      0.25
    " dengan"     0.24
    " unfavor"    0.21
    " avec"       0.21
    " với"        0.21
    "swith"       0.20
    " together"   0.20
    "ewith"       0.19
    Activation Density: 0.086%

    No Known Activations