INDEX
    Explanations

    adjectives describing positive experiences

    descriptions of pleasant or unpleasant experiences and feelings

    Negative Logits
    arius       -0.68
     blindly    -0.64
    aan         -0.63
    ULE         -0.62
    inition     -0.62
    ithing      -0.61
    limits      -0.61
    mining      -0.61
    govtrack    -0.60
    rules       -0.60
    Positive Logits
    ries          1.09
    lihood        0.98
     pleasant     0.89
     smelling     0.86
     surprises    0.82
    ness          0.81
    ーテ            0.77
     experiences  0.77
    rious         0.76
     unpleasant   0.76
    Act Density 0.031%

    No Known Activations