INDEX
    Explanations

    information related to updates, cooperation, commitments, and support

    phrases indicating commitment to transparency and accountability

    Negative Logits
    Token               Logit
    "Untitled"          -0.79
    "soDeliveryDate"    -0.67
    "?),"               -0.66
    "opter"             -0.65
    "?)."               -0.63
    " forgot"           -0.63
    " dunno"            -0.62
    "guy"               -0.62
    "perture"           -0.62
    "irement"           -0.61

    Positive Logits
    Token               Logit
    " ourselves"         1.05
    " vigorously"        0.97
    " vigilant"          0.90
    " robust"            0.85
    " vigil"             0.83
    " vigorous"          0.81
    " diligent"          0.81
    " strive"            0.79
    " our"               0.78
    " principled"        0.78
    Act Density 0.529%

    No Known Activations