    Explanations

    principles and values in text, particularly governance principles, moral principles, and core values

    references to foundational principles and values

    Negative Logits
    " glitches"     -0.70
    "tg"            -0.67
    " guesses"      -0.67
    " speculate"    -0.66
    " rumors"       -0.66
    "ammers"        -0.66
    " queues"       -0.65
    "ctic"          -0.63
    "sites"         -0.63
    "events"        -0.63
    Positive Logits
    " enshr"        1.49
    " principles"   1.24
    " embodied"     1.22
    " underpin"     1.16
    " guiding"      1.13
    " Principles"   1.13
    " articulated"  1.08
    " principle"    1.07
    " upheld"       1.06
    " precept"      1.05
    Activation Density: 0.233%

    No Known Activations