INDEX
    Explanations

    mentions of values and principles

    arguments related to morality and ethical considerations in societal frameworks

    Negative Logits
    Token            Logit
    "lbs"            -0.72
    "heast"          -0.65
    " Pwr"           -0.65
    "NES"            -0.63
    "UFC"            -0.61
    " Sprint"        -0.60
    "tips"           -0.59
    " VIP"           -0.59
    "laun"           -0.59
    " Emergency"     -0.59

    Positive Logits
    Token            Logit
    " epist"         1.15
    " philosophers"  0.97
    " presupp"       0.96
    " insofar"       0.95
    " empirical"     0.92
    " normative"     0.91
    " implicitly"    0.89
    " subjective"    0.86
    " intrinsically" 0.86
    " empir"         0.86
    Activation Density: 2.124%

    No Known Activations