    Explanations

    instances of the word "to" used for expressing intention or purpose

    Negative Logits
    wart
    -0.16
    reich
    -0.15
    agen
    -0.15
    夫
    -0.14
    atar
    -0.14
    enton
    -0.14
    jes
    -0.14
    iew
    -0.14
    adel
    -0.14
    ickle
    -0.14
    Positive Logits
     because
    0.21
    because
    0.17
     karena
    0.17
     porque
    0.16
    anse
    0.15
    ERRU
    0.15
     omdat
    0.15
    缮を
    0.15
     потому
    0.14
     Because
    0.14
    Act Density 0.245%

    No Known Activations