INDEX
    Explanations

    phrases related to self-awareness and acknowledgment of societal issues

    Negative Logits
    unj
    -0.15
    eniable
    -0.14
    iber
    -0.14
    aland
    -0.14
    ungeons
    -0.14
    iad
    -0.13
     lon
    -0.13
     Tome
    -0.13
    land
    -0.13
    zm
    -0.13
    Positive Logits
    istrovství
    0.16
    etas
    0.15
    holm
    0.14
     바로
    0.14
    帖
    0.14
    ysz
    0.14
    rab
    0.14
    iew
    0.14
    reste
    0.13
    punk
    0.13
    Act Density 0.147%

    No Known Activations