INDEX
    Explanations

    the word "joke" followed by either the number 9 or 10

    the concept of "joke" in various contexts

    Negative Logits
    Token           Logit
    "undai"         -0.71
    "phalt"         -0.67
    "ignty"         -0.67
    "ignt"          -0.66
    "EVA"           -0.64
    "actions"       -0.64
    "enture"        -0.63
    "ills"          -0.63
    "CLASSIFIED"    -0.63
    "arnaev"        -0.62
    Positive Logits
    Token           Logit
    "ously"         1.14
    " jokes"        0.93
    "osal"          0.83
    " joking"       0.82
    " mocking"      0.81
    "bag"           0.79
    "sters"         0.79
    "writer"        0.78
    " joke"         0.77
    "bags"          0.76
    Activation Density: 0.049%
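
    The density figure is easiest to read as a fraction of token positions. The sketch below assumes density means the share of sampled tokens on which the feature's activation is above zero; that definition, the function name `activation_density`, and the sample of 100,000 tokens are illustrative assumptions, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all (activation > 0).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 49 active positions out of 100,000 tokens.
sample = [1.0] * 49 + [0.0] * (100_000 - 49)
density = activation_density(sample)
print(f"{density:.3%}")
```

    Under that assumption, a density of 0.049% corresponds to roughly one active token in every 2,000 sampled.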

    No Known Activations