INDEX
    Explanations

    positive actions or attributes

    phrases that indicate permission, opportunity, or flexibility

    Negative Logits
    Token             Logit
    " Cth"            -0.64
    "merce"           -0.60
    "ittens"          -0.59
    "ale"             -0.57
    " Epidem"         -0.56
    " Pipeline"       -0.56
    "anish"           -0.56
    " Zimbabwe"       -0.56
    " seams"          -0.56
    " Consortium"     -0.55
    Positive Logits
    Token             Logit
    "thood"           0.83
    " choice"         0.74
    " opportunity"    0.70
    " chance"         0.70
    "ãĥİ"             0.68
    "license"         0.68
    "ãĥĸ"             0.65
    "ppo"             0.61
    "freedom"         0.61
    "vik"             0.60
    Activation Density: 0.415%

    No Known Activations