INDEX
    Explanations

    references to humor and satire

    Negative Logits
    Token                 Logit
    "ports"               -0.88
    "ignty"               -0.80
    "hips"                -0.72
    "enfranch"            -0.69
    " cryst"              -0.68
    "¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯"    -0.67
    " Components"         -0.66
    "eded"                -0.66
    "uchs"                -0.65
    "yer"                 -0.65
    Positive Logits
    Token                 Logit
    "ously"                1.08
    " jokes"               0.95
    " mocking"             0.90
    " humour"              0.86
    " parody"              0.86
    "osity"                0.85
    " joking"              0.84
    " satir"               0.84
    " joke"                0.83
    " humor"               0.83
    Activation Density: 1.212%
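
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation exceeds zero; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and used only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where activation > threshold.

    Assumption: "density" means the share of positions on which the
    feature fires at all. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over eight token positions.
sample = [0.0, 0.0, 1.7, 0.0, 0.0, 0.0, 0.3, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 25.000%
```

Under this reading, a density of 1.212% would mean the feature fires on roughly 12 of every 1,000 sampled tokens.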

    No Known Activations