INDEX
    Explanations

    mentions of fundamental concepts or beliefs

    references to principles, particularly in legal and ethical contexts

    Negative Logits
    ammers        -0.71
    minster       -0.70
    leneck        -0.68
    ilation       -0.67
    NetMessage    -0.67
    quer          -0.67
    eor           -0.66
    essions       -0.65
    dos           -0.64
    hiba          -0.64
    Positive Logits
     principles    1.01
     principle     0.91
     guiding       0.90
    ciples         0.88
     underlying    0.83
    ually          0.82
     underpin      0.81
     Principles    0.80
     precept       0.73
    cipled         0.72
    Act Density 0.023%

    No Known Activations