INDEX
    Explanations

    words describing positive or negative experiences and emotions

    Negative Logits
    Token        Logit
    "aucus"      -0.69
    "arians"     -0.63
    "aan"        -0.62
    "arius"      -0.61
    "inition"    -0.61
    " Feder"     -0.61
    "ARS"        -0.60
    "govtrack"   -0.59
    "ARE"        -0.59
    "FIL"        -0.59
    Positive Logits
    Token          Logit
    "ries"         1.29
    " surprises"   1.04
    "ness"         1.01
    " smelling"    1.00
    "lihood"       0.98
    "ties"         0.96
    " pleasant"    0.90
    "nesses"       0.89
    "ments"        0.85
    " surpr"       0.81
    Activation Density: 0.013%

    No Known Activations