    Explanations

    words related to negative experiences or actions

    Negative Logits
    Token            Logit
    "DonaldTrump"    -0.65
    " Paddock"       -0.59
    "ibrary"         -0.56
    " Eden"          -0.55
    " Kaiser"        -0.55
    " House"         -0.54
    " Collider"      -0.53
    " Everest"       -0.53
    " Wildcats"      -0.52
    " Pillar"        -0.52
    Positive Logits
    Token        Logit
    " alike"     0.73
    "eworthy"    0.71
    "ifies"      0.71
    "ilit"       0.71
    " versa"     0.69
    "ifying"     0.66
    "ining"      0.66
    "ify"        0.66
    "igmat"      0.66
    "aund"       0.65
    Activation Density: 0.253%

    No Known Activations