INDEX
    Explanations

    phrases or sentences indicating unfortunate situations or negative consequences

    expressions of regret or disappointment

    Negative Logits
    Token        Logit
    "tein"       -0.77
    "ingham"     -0.74
    "zag"        -0.71
    "arnaev"     -0.71
    "afort"      -0.70
    "arij"       -0.70
    " kindred"   -0.69
    "icle"       -0.67
    "ipers"      -0.67
    "appro"      -0.67
    Positive Logits
    Token            Logit
    " adolesc"       0.71
    "nces"           0.68
    " Delicious"     0.68
    "imaru"          0.68
    "ÃĽ"             0.64
    "é¾į"            0.64
    "ESA"            0.63
    " reproduce"     0.63
    " unfortunate"   0.62
    " Tos"           0.62
    Activation Density: 0.016%

    No Known Activations