INDEX
    Explanations

    expressions related to goals, plans, and intentions

    NEGATIVE LOGITS
    Token          Logit
    "Guard"        -0.75
    "sis"          -0.70
    "SPONSORED"    -0.64
    "hook"         -0.61
    "fax"          -0.61
    "éĸ"           -0.61
    "oran"         -0.61
    "rm"           -0.60
    "dating"       -0.60
    " incl"        -0.60
    POSITIVE LOGITS
    Token            Logit
    " simplicity"    0.83
    " consistency"   0.80
    " clarity"       0.77
    " seamless"      0.74
    " healthy"       0.74
    " faire"         0.74
    "imize"          0.72
    " fairness"      0.72
    " disruptive"    0.72
    " awareness"     0.71
    Activation Density: 0.299%

    No Known Activations