    Explanations

    personal pronouns and verb phrases

    references to specific individuals and expressions of gratitude or acknowledgment

    Negative Logits
    Token            Logit
    "acket"          -0.70
    "indal"          -0.69
    "Prev"           -0.67
    "VERTISEMENT"    -0.65
    "elson"          -0.65
    "theless"        -0.64
    "ossibility"     -0.63
    "livious"        -0.63
    "ocity"          -0.62
    "ogether"        -0.61
    Positive Logits
    Token            Logit
    " hereby"        0.82
    " bearer"        0.80
    " swear"         0.79
    " thank"         0.76
    " thou"          0.75
    " dear"          0.74
    " fuck"          0.73
    " thy"           0.72
    " ye"            0.71
    " weep"          0.67
    Activation Density: 0.276%
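
    The density figure above is presumably the share of token positions on which this feature fires. The page does not state how it is computed, so the following is only a minimal sketch under that assumption; the function name, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" at a position when its
    activation is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 28 of them with a nonzero activation.
sample = [0.0] * 9972 + [1.5] * 28
density = activation_density(sample)
print(f"{density:.3%}")
```

    A density well under 1% would mean the feature is sparse, firing on only a few tokens in every thousand.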

    No Known Activations