    Explanations

    phrases related to incidents of violence and danger

    Negative Logits
    Token             Logit
    " smiles"         -0.14
    " smile"          -0.14
    "Tomorrow"        -0.13
    "ologue"          -0.13
    " Sender"         -0.13
    "umpt"            -0.13
    "ètre"            -0.13
    " smiling"        -0.13
    ".eng"            -0.13
    "_iff"            -0.13
    Positive Logits
    Token             Logit
    " witnessing"     0.21
    "heard"           0.21
    " dash"           0.20
    " heard"          0.20
    " intervene"      0.19
    " intervened"     0.19
    " intervention"   0.18
    " hero"           0.18
    " hearing"        0.18
    " interven"       0.18
    Activation Density: 0.130%

    No Known Activations