    Explanations

    statements expressing strong opinions or beliefs

    references to rights and the importance of equal treatment for all individuals

    Negative Logits
    Token            Logit
    "untled"         -0.70
    "catentry"       -0.69
    " Reported"      -0.68
    " unexpectedly"  -0.66
    " premature"     -0.65
    "iatus"          -0.64
    " Prompt"        -0.63
    "senal"          -0.63
    " anecd"         -0.63
    "MpServer"       -0.63
    Positive Logits
    Token            Logit
    " cannot"        0.94
    "verning"        0.93
    " therefore"     0.89
    " belongs"       0.87
    " shouldn"       0.86
    " obey"          0.82
    " belong"        0.82
    "respective"     0.80
    " uphold"        0.79
    " arrog"         0.77
    Activation Density: 0.644%

    No Known Activations