INDEX
    Explanations

    phrases indicating someone committing or getting away with something, often negative

    phrases indicating evasion or getting away with actions

    Negative Logits
    Token          Logit
    "stem"         -0.77
    "wake"         -0.74
    "wash"         -0.74
    "agues"        -0.73
    "outer"        -0.71
    "link"         -0.68
    "arta"         -0.68
    "ships"        -0.64
    "main"         -0.64
    "wordpress"    -0.63
    Positive Logits
    Token              Logit
    " impunity"        0.94
    " murder"          0.88
    " manslaughter"    0.78
    " murdering"       0.77
    " exploiting"      0.76
    " Murder"          0.75
    " polygamy"        0.72
    " exploitation"    0.70
    " plunder"         0.67
    " anything"        0.65
    Act Density 0.064%

    No Known Activations