INDEX
    Explanations

    references to institutional frameworks and regulations affecting individual rights and responsibilities

    Negative Logits
    [ â̦↵↵]    -0.19
    [ Âł]      -0.19
    [ â̦↵]     -0.17
    [ :↵]      -0.17
    [ ]        -0.15
    (missing)  -0.15
    [ :↵↵]     -0.15
    [ â̦]      -0.15
    [ â̦.]     -0.14
    (missing)  -0.13

    Positive Logits
    [ \↵]      0.49
    [\↵]       0.48
    [,\↵]      0.36
    [ "\↵]     0.29
    ["\↵]      0.29
    [ãĢģ↵]     0.28
    ["+↵]      0.28
    [ "+↵]     0.27
    [ \č↵]     0.27
    [(↵]       0.26
    Act Density 8.944%

    No Known Activations
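
    Several token strings in the logit lists (for example "ãĢģ" and "č") look like the printable stand-ins that byte-level BPE vocabularies use for raw bytes. The sketch below shows one way to turn such display strings back into readable text. It assumes a GPT-2 style byte-to-unicode table and that "↵" stands for a newline; neither is confirmed by this page, and the names bytes_to_unicode and decode_token are illustrative only.

```python
def bytes_to_unicode():
    """Build a GPT-2 style byte -> printable character table (assumed scheme)."""
    # Printable bytes map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point from 256 upward.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Reverse table: display character -> original byte.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Decode a dashboard token string into text (hypothetical helper)."""
    out = bytearray()
    for ch in display:
        if ch == "↵":
            # Assumption: the dashboard renders a newline as "↵".
            out.append(0x0A)
        elif ch in UNICODE_TO_BYTE:
            out.append(UNICODE_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a literal space) pass through.
            out.extend(ch.encode("utf-8"))
    # A single token may hold only part of a multi-byte character.
    return out.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĢģ↵", " \\č↵", "Âł"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

    Under these assumptions, "ãĢģ↵" decodes to the ideographic comma followed by a newline, "č" to a carriage return, and "Âł" to a non-breaking space. Read this way, the top positive logits are mostly line-continuation or punctuation characters immediately followed by a line break.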