INDEX
    Explanations

    mention of movies, actors, and directors

    references to comedy and comedic elements

    Negative Logits
    "ribute"       -0.77
    "orem"         -0.74
    "fortune"      -0.74
    "animous"      -0.73
    "accompan"     -0.73
    "drawn"        -0.72
    "ributed"      -0.72
    "inness"       -0.71
    "inguished"    -0.71
    "scrib"        -0.70
    Positive Logits
    "edy"            1.08
    " Reloaded"      0.76
    " Unleashed"     0.68
    " Zed"           0.67
    " kW"            0.67
    " Enterprises"   0.66
    "bda"            0.65
    " Bang"          0.64
    "tsky"           0.64
    "deen"           0.63
    Activation Density: 0.009%

    No Known Activations