INDEX
    Explanations

    verbs related to physical actions or outcomes

    Negative Logits
    Token            Logit
    " wont"          -0.59
    "lihood"         -0.57
    " requiring"     -0.56
    " unfolding"     -0.55
    " applied"       -0.55
    " preventing"    -0.55
    " unfolded"      -0.54
    " specifying"    -0.54
    " assisting"     -0.53
    " namely"        -0.53
    Positive Logits
    Token            Logit
    " oneself"       0.86
    " yourselves"    0.84
    " yourself"      0.83
    " ourselves"     0.75
    "ンジ"           0.73
    " toes"          0.70
    "ulate"          0.68
    " noses"         0.68
    "kered"          0.65
    " tune"          0.65
    Act Density 0.811%

    No Known Activations