INDEX
    Explanations

    phrases related to actions or events happening or being done 'to' someone or something

    instances of the word "to" indicating purpose or intention

    Negative Logits
    Token           Logit
    "typ"           -0.60
    " disadvant"    -0.59
    "emort"         -0.57
    " hemor"        -0.55
    " behav"        -0.54
    " shenan"       -0.54
    "tun"           -0.54
    " Seym"         -0.53
    " vulner"       -0.53
    " Vaugh"        -0.53

    Positive Logits
    Token           Logit
    "ggles"         0.83
    "wered"         0.81
    "ilet"          0.77
    " celebrate"    0.76
    " relieve"      0.74
    " obtain"       0.74
    " promote"      0.73
    " accommodate"  0.72
    "asted"         0.72
    " avoid"        0.72

    Activation Density: 0.199%

    No Known Activations