INDEX
    Explanations

    phrases instructing or encouraging actions

    commands or instructions that encourage action or engagement

    Negative Logits
    Token            Logit
    "emale"          -0.71
    "edge"           -0.70
    "album"          -0.69
    "Reply"          -0.61
    "ungle"          -0.61
    "iege"           -0.59
    "hell"           -0.58
    "eller"          -0.58
    " burden"        -0.57
    "apo"            -0.57

    Positive Logits
    Token            Logit
    " yourselves"    1.05
    " yourself"      1.02
    "ings"           0.85
    " Yourself"      0.83
    "ardless"        0.81
    " thou"          0.79
    " ye"            0.75
    " ya"            0.71
    " Tata"          0.71
    "able"           0.68
    Activation Density: 0.209%

    No Known Activations