INDEX
    Explanations

    phrases related to improving the world or making it a better place

    Negative Logits
    Token              Logit
    "tein"             -0.97
    " Removal"         -0.77
    "oyal"             -0.73
    " omission"        -0.69
    "roversial"        -0.68
    " caut"            -0.67
    " interval"        -0.66
    " effectiveness"   -0.63
    " persistence"     -0.63
    "itone"            -0.63
    Positive Logits
    Token              Logit
    " revolves"        0.83
    "Thumbnail"        0.80
    " engulfed"        0.75
    " anew"            0.74
    " darkened"        0.73
    "wake"             0.71
    " liv"             0.71
    " trillions"       0.69
    " ravaged"         0.67
    "opolis"           0.67
    Activation Density: 0.469%

    No Known Activations