Explanations
phrases indicating someone committing or getting away with something, often negative
phrases indicating evasion or getting away with actions
Negative Logits
stem        -0.77
wake        -0.74
wash        -0.74
agues       -0.73
outer       -0.71
link        -0.68
arta        -0.68
ships       -0.64
main        -0.64
wordpress   -0.63
Positive Logits
impunity       0.94
murder         0.88
manslaughter   0.78
murdering      0.77
exploiting     0.76
Murder         0.75
polygamy       0.72
exploitation   0.70
plunder        0.67
anything       0.65
Activations Density 0.064%
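
As a rough illustration of what a density figure like the one above could mean, the sketch below treats it as the fraction of sampled token positions on which the feature fires. That reading is an assumption, not something this page states. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active; the page itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 6 of which activate the feature.
sample = [0.0] * 9994 + [1.3, 0.2, 4.1, 0.7, 2.5, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.064% would correspond to roughly 6 or 7 active positions per 10,000 sampled tokens.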