INDEX
    Explanations

    evidence related to sexual harassment allegations and workplace misconduct

    Negative Logits
    ↵↵    -0.77
     &___    -0.61
    )."    -0.61
    SharedDtor    -0.57
    )"    -0.56
     дописавши    -0.55
     surla    -0.55
     سكانية    -0.54
    "]))    -0.53
    >()    -0.52
    Positive Logits
    (token not rendered)    2.85
    (token not rendered)    1.56
    (token not rendered)    1.33
    _    1.28
    .    1.28
    /    1.25
    ?    1.25
    !    1.23
    []    1.21
    ,    1.17
    Act Density 0.172%

    No Known Activations