INDEX
    Explanations

    incidents of violence and crime

    Negative Logits
    " же"            -0.17
    " kå"            -0.15
    "ourg"           -0.15
    " artık"         -0.15
    "avier"          -0.14
    "imo"            -0.14
    " prostituer"    -0.13
    " optionally"    -0.13
    "oise"           -0.13
    "arkin"          -0.13
    Positive Logits
    " while"      0.36
    " whilst"     0.29
    "while"       0.29
    " during"     0.27
    " WHILE"      0.25
    "_while"      0.25
    " minutes"    0.24
    " after"      0.24
    " moments"    0.23
    " While"      0.23
    Act Density 0.266%

    No Known Activations