INDEX
    Explanations

    words related to harmful actions or events and their consequences

    phrases related to crime and its consequences

    Negative Logits
    Token                                 Logit
    "================================"    -0.61
    " Lets"                               -0.59
    " Majesty"                            -0.59
    " advoc"                              -0.57
    "¯¯¯¯¯¯¯¯"                            -0.57
    "idth"                                -0.55
    " Whilst"                             -0.53
    "BuyableInstoreAndOnline"             -0.53
    " Bachelor"                           -0.52
    " Tuls"                               -0.52

    Positive Logits
    Token                                 Logit
    " afterward"                          1.28
    " afterwards"                         1.01
    " elsewhere"                          0.98
    " later"                              0.93
    " thereafter"                         0.82
    " earlier"                            0.79
    " abroad"                             0.79
    " nearby"                             0.78
    "Enlarge"                             0.77
    " beforehand"                         0.71

    Activation Density: 0.617%

    No Known Activations