INDEX
    Explanations

    adjectives describing negative attributes or actions

    negative connotations and associations related to various topics

    Negative Logits
    "theless"          -0.81
    "terday"           -0.69
    " Invalid"         -0.65
    " Shining"         -0.65
    " individually"    -0.64
    " Awakening"       -0.64
    " Enhanced"        -0.63
    "lished"           -0.63
    " dated"           -0.63
    " enriched"        -0.62
    Positive Logits
    "ocations"         1.02
    "aution"           0.99
    "ptions"           0.93
    "ours"             0.92
    "ippers"           0.92
    "notations"        0.90
    "angs"             0.90
    "tones"            0.89
    "oles"             0.88
    "urances"          0.88
    Activation Density: 0.310%

    No Known Activations