INDEX
    Explanations

    references to injuries and damage

    Negative Logits
    "(||"          -0.16
    "irth"         -0.16
    "ousse"        -0.15
    "IDL"          -0.15
    "ropa"         -0.15
    "rove"         -0.15
    ".synthetic"   -0.15
    "ÑĢеÑĪ"        -0.15
    "uzzy"         -0.14
    "æ¹¾"          -0.14
    Positive Logits
    " broken"      0.33
    "broken"       0.31
    "Broken"       0.29
    " Broken"      0.28
    " broke"       0.27
    "-hearted"     0.23
    " breaks"      0.23
    " vá»"         0.23
    " shattered"   0.23
    " apart"       0.22
    Act Density 0.046%

    No Known Activations