INDEX
    Explanations

    terms related to legal violations

    phrases indicating legal violations or infringements

    Negative Logits
    "ritz"         -0.75
    "onna"         -0.73
    "ppa"          -0.72
    "retty"        -0.69
    " dwind"       -0.65
    " thanking"    -0.64
    "borgh"        -0.64
    "achine"       -0.63
    "agne"         -0.62
    "bang"         -0.62
    Positive Logits
    "Ö¼"           0.91
    " norms"       0.82
    " NRS"         0.77
    " instr"       0.76
    " Contracts"   0.75
    "orius"        0.75
    "hibited"      0.73
    " procedural"  0.72
    " CLS"         0.70
    "ãģį"          0.70
    Activation Density: 0.098%

    No Known Activations