INDEX
    Explanations

    instances of coercive and abusive behavior in relationships

    Negative Logits
    Token              Logit
    "selves"           -0.82
    " collectively"    -0.72
    "atures"           -0.68
    "£ı"               -0.67
    " miscar"          -0.67
    "taboola"          -0.65
    " husband"         -0.64
    " Founding"        -0.64
    "result"           -0.64
    "etheless"         -0.63

    Positive Logits
    Token              Logit
    " himself"         1.32
    " his"             0.99
    " Himself"         0.87
    "Jr"               0.73
    " me"              0.71
    " girlfriend"      0.69
    " erection"        0.69
    " raping"          0.69
    " Jr"              0.69
    "sed"              0.66
    Act Density 0.406%

    No Known Activations