INDEX
    Explanations

    adjectives describing the quality or impact of a situation

    expressions of positive or negative outcomes related to policies and news

    Negative Logits
    "racuse"      -0.82
    "hyde"        -0.79
    "opers"       -0.76
    "letes"       -0.72
    "avorite"     -0.71
    "uckle"       -0.70
    "onds"        -0.70
    "agos"        -0.70
    "lete"        -0.67
    "irez"        -0.65
    Positive Logits
    " news"        1.28
    " publicity"   1.17
    " manners"     0.98
    "news"         0.95
    "bye"          0.94
    " luck"        0.94
    " enough"      0.92
    " optics"      0.92
    " karma"       0.91
    " NEWS"        0.89
    Activation Density: 0.092%

    No Known Activations