    Explanations

    expressions of requests or conditions related to individuals and their actions or statuses

    Negative Logits
    Token         Logit
    "azo"         -0.17
    "said"        -0.15
    "說"          -0.14
    " said"       -0.14
    " Twilight"   -0.14
    " nab"        -0.14
    " get"        -0.14
    "adoras"      -0.14
    " says"       -0.13
    "says"        -0.13
    Positive Logits
    Token         Logit
    " wish"       0.71
    " Wish"       0.67
    "wish"        0.65
    " wishes"     0.64
    " wished"     0.59
    " wishing"    0.54
    " desire"     0.31
    " require"    0.30
    " souha"      0.29
    " wishlist"   0.28
    Activation Density: 0.223%

    No Known Activations