INDEX
    Explanations

    references to legal violations

    Negative Logits
    Token        Logit
    "mad"        -0.87
    "arger"      -0.77
    "rich"       -0.73
    "anka"       -0.72
    "abs"        -0.71
    "rose"       -0.69
    "opped"      -0.69
    "azor"       -0.67
    "iris"       -0.66
    "venture"    -0.65
    Positive Logits
    Token                 Logit
    " laws"               0.89
    " Laws"               0.85
    " statutes"           0.82
    " confidentiality"    0.81
    " violations"         0.80
    " norms"              0.80
    " curfew"             0.76
    " unfocusedRange"     0.75
    "laws"                0.75
    " antitrust"          0.74
    Activation Density: 0.019%

    No Known Activations