INDEX
    Explanations

    phrases emphasizing a specific idea or point

    references to important messages or themes

    Negative Logits
    "engeance"  -0.73
    "ords"      -0.71
    "urses"     -0.68
    "erenn"     -0.68
    "ENCY"      -0.67
    "agonists"  -0.67
    "enses"     -0.65
    "itates"    -0.65
    "umbers"    -0.65
    "endars"    -0.64
    Positive Logits
    " message"   1.05
    " messages"  1.00
    " Messages"  0.91
    "message"    0.91
    " conveyed"  0.87
    "board"      0.83
    "posts"      0.83
    "FontSize"   0.82
    " goodbye"   0.80
    "Message"    0.77
    Activation Density: 0.025%

    No Known Activations