INDEX
    Explanations

    mentions of the word "Hum" with comparably high activations

    references to humor or comedic elements

    NEGATIVE LOGITS
    Token            Logit
    "cape"           -0.71
    " Peaks"         -0.71
    " ---------"     -0.71
    " EntityItem"    -0.70
    "ãĤ´ãĥ³"         -0.68
    "å§"             -0.65
    "hips"           -0.65
    " flagged"       -0.64
    " Sands"         -0.62
    " approve"       -0.62
    POSITIVE LOGITS
    Token            Logit
    "pty"            1.06
    "ility"          1.04
    "orously"        1.03
    "iday"           1.02
    "undai"          1.01
    "ankind"         0.99
    " Hum"           0.92
    "Hum"            0.88
    "mers"           0.87
    "obiles"         0.86
    Activation Density: 0.012%

    No Known Activations