INDEX
    Explanations

    concepts related to innocence and personal relationships

    harm or safety of innocents

    Negative Logits
    " arquitetura"        -0.36
    " pós"                -0.31
    " pula"               -0.31
    "AutoresizingMask"    -0.31
    "handeling"           -0.30
    " Dicapai"            -0.29
    "ElementException"    -0.29
    " défaut"             -0.29
    " useStyles"          -0.29
    " nucléaire"          -0.29
    Positive Logits
    "ValueStyle"          0.83
    " unprotected"        0.60
    " safety"             0.59
    " casualties"         0.58
    "safety"              0.57
    " nonUne"             0.54
    "الحياه"              0.53
    " cjs"                0.51
    "Innoc"               0.51
    "Safety"              0.51
    Act Density 0.134%

    No Known Activations