INDEX
    Explanations

    instances of verbs or phrases indicating intention or direction

    NEGATIVE LOGITS
    usher           -0.16
    евеÑĢ          -0.15
    åŁº             -0.14
    lixir           -0.14
    adera           -0.14
    esian           -0.13
     uÄŁ            -0.13
    lamaz           -0.13
     سر             -0.13
    à¸Ľà¸£à¸°à¸Īำ   -0.13
    POSITIVE LOGITS
     innovate       0.16
    967             0.15
     live           0.15
    å¾®ç¬ij        0.14
     Holl           0.14
    ucceed          0.14
    OTS             0.14
    otec            0.14
    .chapter        0.14
     succeed        0.14
    Activation Density 0.566%

    No Known Activations