INDEX
    Explanations

    words related to various forms of abuse and mistreatment

    Negative Logits
    ãĤ·ãĤ¢
    -0.16
    verse
    -0.14
    rey
    -0.14
    set
    -0.14
    ennon
    -0.14
    liga
    -0.14
    .Encoding
    -0.14
    çŃĴ
    -0.14
    تاÙĨ
    -0.14
    revision
    -0.14
    Positive Logits
    fully
    0.15
     Spirits
    0.15
    733
    0.15
    amac
    0.14
     Reporting
    0.14
     Fletcher
    0.14
     Becker
    0.13
    /validation
    0.13
    (mark
    0.13
    .Formatting
    0.13
    Activation Density: 0.026%

    No Known Activations