INDEX
    Explanations

    fostering positive outcomes

    Negative Logits
     captivating   0.58
                   0.58
    <unused569>    0.58
                   0.54
    展现           0.52
                   0.52
     масштаб       0.52
    无论是         0.52
     undeniably    0.51
                   0.51
    Positive Logits
     (             0.61
     abortions     0.58
     abusing       0.57
     deaths        0.57
     very          0.55
     diseases      0.55
     died          0.52
     drugs         0.52
     was           0.51
    info           0.51
    Act Density 0.014%

    No Known Activations