INDEX
Explanations
phrases related to violating laws, rules, or standards
references to violations of laws and constitutional rights
Negative Logits
Token        Logit
iris         -0.73
bang         -0.72
enhagen      -0.71
hin          -0.70
rising       -0.68
naissance    -0.67
uilding      -0.62
rowth        -0.61
ppel         -0.61
compares     -0.60
Positive Logits
Token              Logit
confidentiality    1.32
privacy            1.18
laws               1.11
prohibitions       1.11
rights             1.07
obligations        1.05
norms              1.05
principles         1.03
tenets             1.02
neutrality         1.00
Activations Density 0.159%
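
As a rough illustration of what a density figure like the one above measures, the sketch below assumes it is the fraction of sampled token positions on which the feature's activation is above zero. The page itself does not state the exact definition, so treat that as an assumption. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all; the source page does not define it explicitly.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 16 of which are active.
toy_activations = [0.0] * 9984 + [0.5] * 16
density = activation_density(toy_activations)

# Format as a percentage with three decimals, matching the style used above.
print(f"{density:.3%}")
```

A density well under 1% would mean the feature is sparse: it is silent on the vast majority of tokens and fires only on a narrow slice of text.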