    Explanations

    references to jokes or humorous statements

    mentions of jokes or humorous references

    Negative Logits
    "phalt"         -0.83
    "undai"         -0.82
    "porting"       -0.72
    "enture"        -0.72
    "CLASSIFIED"    -0.69
    "rive"          -0.68
    "ignty"         -0.66
    "ighting"       -0.66
    "orneys"        -0.66
    "ports"         -0.66
    Positive Logits
    "ously"         0.99
    " jokes"        0.91
    " joking"       0.86
    " joke"         0.86
    "bags"          0.81
    "bag"           0.79
    "osal"          0.79
    " caller"       0.77
    " Pom"          0.77
    " mocking"      0.76
    Activation Density 0.019%

    No Known Activations