    Explanations

    parodies and satirical content in text

    instances of parody and satire in the text

    Negative Logits
    Token                 Logit
    "erto"                -0.78
    "oard"                -0.75
    "Streamer"            -0.75
    " violet"             -0.75
    "vals"                -0.72
    "negie"               -0.72
    "rain"                -0.71
    "Va"                  -0.71
    "¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯"    -0.70
    "Transfer"            -0.70

    Positive Logits
    Token                 Logit
    " satir"              1.25
    " satire"             1.25
    " spoof"              1.21
    " parody"             1.20
    " mockery"            1.04
    " mocking"            1.04
    " satirical"          1.02
    " caric"              1.00
    " caricature"         0.91
    "netflix"             0.88

    Activation Density: 0.015%

    No Known Activations