INDEX
    Explanations

    terms related to legal and social issues involving actions and behaviors

    phrases related to harassment and assault

    Negative Logits
    "[/"            -0.64
    "ortium"        -0.62
    "izen"          -0.59
    "*/("           -0.58
    "ynes"          -0.57
    "ozy"           -0.57
    " Lanka"        -0.56
    "Reviewer"      -0.55
    "])."           -0.55
    "actionDate"    -0.54
    Positive Logits
    "their"         1.40
    " their"        1.23
    " theirs"       1.18
    " THEIR"        1.13
    "Their"         1.10
    " they"         1.05
    "they"          1.01
    "They"          0.99
    " THEY"         0.96
    " Their"        0.93
    Activation Density: 1.777%

    No Known Activations