    Explanations

    terms associated with physical damage or impairment

    Negative Logits
    kir
    -0.16
    curacy
    -0.16
    ksi
    -0.15
    kses
    -0.15
    zug
    -0.14
    enza
    -0.14
    ughs
    -0.14
    окси
    -0.14
    riter
    -0.14
    机关
    -0.14
    Positive Logits
    /problem
    0.17
    /null
    0.17
    cies
    0.16
    /exp
    0.16
     humanity
    0.15
     Sie
    0.14
    enuity
    0.14
    /false
    0.14
    roe
    0.14
     Humanity
    0.14
    Act Density 0.101%

    No Known Activations