INDEX
    Explanations

    information related to instances of violence, crime, and legal matters

    Negative Logits
    Token         Logit
    "pires"       -0.82
    "itates"      -0.55
    " ceases"     -0.55
    " sleeps"     -0.54
    " relies"     -0.52
    "itiz"        -0.51
    " likes"      -0.51
    " guiIcon"    -0.50
    "bara"        -0.49
    " grows"      -0.49
    Positive Logits
    Token              Logit
    " respectively"    1.81
    " apiece"          1.40
    " respective"      0.96
    " themselves"      0.93
    " together"        0.88
    " collectively"    0.79
    "*."               0.76
    " jointly"         0.74
    " whereas"         0.72
    "."                0.71
    Act Density 0.557%

    No Known Activations