    Explanations

    phrases that describe negative consequences or impacts of various actions or events

    Negative Logits
    "دÙĬØ«"  -0.15
    "endid"  -0.14
    "icina"  -0.14
    "NSBundle"  -0.14
    " ÑĩÑĤобÑĭ"  -0.14
    "onian"  -0.14
    " inorder"  -0.13
    "ÙĨداÙĨ"  -0.13
    "åĭĻ"  -0.13
    "šem"  -0.13
    Positive Logits
    " both"  0.28
    " not"  0.25
    " upon"  0.23
    "both"  0.22
    "/effects"  0.20
    " felt"  0.19
    " BOTH"  0.19
    "felt"  0.18
    " nejen"  0.18
    " lives"  0.17
    Activation Density: 0.161%

    No Known Activations