    Explanations

    statements related to legal or moral implications and consequences

    phrases indicating reasonable beliefs related to danger and injury

    Negative Logits
    DK            -0.58
     Patreon      -0.56
    ahime         -0.56
    MK            -0.55
    iets          -0.55
    MSN           -0.55
    Knight        -0.54
     Jed          -0.54
    Dialogue      -0.54
    emonium       -0.54
    Positive Logits
    ).[           0.76
    Downloadha    0.73
    )).           0.72
     harm         0.69
    ").           0.68
    )."           0.67
     biological   0.62
     or           0.62
    azo           0.61
     detriment    0.61
    Activation Density: 2.096%

    No Known Activations