INDEX
    Explanations

    phrases related to making the world a better place

    concepts related to improving the world or societal betterment

    Negative Logits
    "tein"            -0.75
    " persistence"    -0.71
    " Removal"        -0.69
    " incent"         -0.68
    " omission"       -0.68
    "IOR"             -0.67
    " leakage"        -0.63
    "inelli"          -0.62
    " retention"      -0.62
    "phrase"          -0.62
    Positive Logits
    " revolves"       0.92
    "Thumbnail"       0.81
    " darkened"       0.80
    " habitable"      0.75
    " inhabited"      0.74
    " hosp"           0.74
    " enslaved"       0.72
    " bends"          0.71
    "ankind"          0.70
    "wake"            0.70
    Activation Density: 0.322%

    No Known Activations