    Explanations

    phrases expressing positive sentiment or approval

    expressions of enthusiasm or positivity

    Negative Logits
    Token           Logit
    "mitt"          -0.68
    "ilus"          -0.67
    "canon"         -0.66
    "clips"         -0.64
    "former"        -0.62
    "ethyl"         -0.62
    "ople"          -0.62
    "pex"           -0.62
    "missions"      -0.62
    "eter"          -0.62
    Positive Logits
    Token           Logit
    "sword"         0.91
    " opportunity"  0.89
    " strides"      0.82
    " deal"         0.81
    " outdoors"     0.81
    " fun"          0.80
    " idea"         0.79
    " tasting"      0.76
    " insight"      0.76
    "shield"        0.73
    Activation Density: 0.068%

    No Known Activations