INDEX
    Explanations

    safety and security-related terms

    terms related to security threats and illegal activities

    Negative Logits
    Token           Logit
    " sums"         -0.56
    " CONTR"        -0.56
    " Factors"      -0.55
    " Balanced"     -0.53
    " Donation"     -0.53
    " Decision"     -0.52
    " trust"        -0.51
    "amen"          -0.51
    " Piano"        -0.51
    " gratitude"    -0.51
    Positive Logits
    Token           Logit
    " abound"       1.03
    " prolifer"     0.98
    " rampant"      0.98
    " everywhere"   0.91
    " popping"      0.91
    " bloom"        0.89
    " lurking"      0.84
    " prevalent"    0.81
    " emerge"       0.81
    " thrive"       0.80
    Activation Density: 1.348%

    No Known Activations