INDEX
    Explanations

    statements related to legal authority and human rights issues

    Negative Logits
    Token           Logit
    "achsen"        -0.14
    "Crime"         -0.14
    "çĬ¯"           -0.14
    " Murder"       -0.14
    "罪"            -0.14
    " incap"        -0.13
    " Rebellion"    -0.13
    " Crime"        -0.13
    "628"           -0.13
    "utt"           -0.13
    Positive Logits
    Token                Logit
    " arbitrary"         0.27
    " Kafka"             0.26
    " Dra"               0.26
    " retro"             0.24
    " Arbitrary"         0.24
    " rushed"            0.23
    "dra"                0.22
    " discriminatory"    0.22
    " kafka"             0.21
    " subjective"        0.21
    Activation Density: 0.400%

    No Known Activations