    Explanations

    words related to torture and physical abuse

    Negative Logits
    Token               Logit
    " donor"            -0.97
    "earance"           -0.82
    "BY"                -0.79
    " donation"         -0.78
    " iod"              -0.77
    "ership"            -0.75
    " Prospect"         -0.75
    " magnification"    -0.74
    "Ü"                 -0.73
    "ACP"               -0.72
    Positive Logits
    Token               Logit
    "urous"             1.37
    "oise"              1.35
    "illas"             1.11
    "anamo"             0.97
    "ured"              0.96
    "aste"              0.95
    "teenth"            0.95
    "uous"              0.94
    "uses"              0.92
    "ificate"           0.92
    Activation Density: 0.796%

    No Known Activations