INDEX
    Explanations

    terms indicating negative impacts or harmful effects

    detrimental or detriment

    Negative Logits
    kindness          -0.56
    timacy            -0.55
    doma              -0.55
     Dapper           -0.54
    Harsh             -0.54
    agency            -0.54
     Kindness         -0.54
    Hardy             -0.54
    hybrid            -0.53
    CAPE              -0.53
    Positive Logits
     detriment        0.59
     nahilalakip      0.53
     jälkeen          0.46
     detrimental      0.45
     relegated        0.45
    enschappelijke    0.45
     decoración       0.42
     inmediatamente   0.42
    ésult             0.42
    Spoljašnje        0.41
    Activation Density: 0.011%

    No Known Activations