INDEX
    Explanations

    mentions of fundamental beliefs or rules

    references to established principles or guidelines

    Negative Logits
    minster       -0.74
    ilation       -0.70
    sg            -0.69
    leneck        -0.68
    otte          -0.66
    aughter       -0.65
    olla          -0.65
    eor           -0.64
    reen          -0.64
    ready         -0.64
    Positive Logits
     principles    1.12
    ciples         0.95
     guiding       0.95
     principals    0.91
     Principles    0.91
     underpin      0.88
     principle     0.85
    ophical        0.83
     underlying    0.80
     precept       0.77
    Act Density 0.018%

    No Known Activations