INDEX
    Explanations

    prepositions and verbs related to movement or action

    prepositions and phrases indicating direction or purpose

    Negative Logits
    Token         Logit
    "cember"      -0.68
    "Released"    -0.65
    " teasp"      -0.58
    "uber"        -0.58
    "pection"     -0.57
    "aird"        -0.57
    "jo"          -0.57
    "arnaev"      -0.56
    "iam"         -0.56
    "ird"         -0.56
    Positive Logits
    Token         Logit
    " oneself"    0.94
    " whatever"   0.81
    " anything"   0.80
    "addons"      0.77
    " whichever"  0.75
    "uate"        0.70
    "WARD"        0.70
    " any"        0.67
    " something"  0.67
    " Yourself"   0.65
    Activation Density: 0.534%

    No Known Activations