INDEX
    Explanations

    references to jokes

    occurrences of the word "jokes"

    Negative Logits
    "ignty"               -0.69
    "CLASSIFIED"          -0.67
    "ioch"                -0.63
    " violet"             -0.62
    " Borders"            -0.61
    "¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯"    -0.60
    "orneys"              -0.60
    "itizen"              -0.58
    "JUST"                -0.58
    "ignt"                -0.57

    Positive Logits
    " jokes"               1.02
    "linger"               0.81
    "ters"                 0.80
    " joking"              0.78
    "ster"                 0.78
    " banter"              0.78
    "sters"                0.76
    "ong"                  0.75
    "osal"                 0.74
    "itone"                0.74

    Act Density 0.010%

    No Known Activations