INDEX
    Explanations

    instances of humor and absurdity in everyday situations

    Negative Logits
    " Notice"           -0.54
    "Notice"            -0.54
    "Sign"              -0.53
    "<bos>"             -0.52
    " pre"              -0.48
    " Sign"             -0.48
    "HandlerContext"    -0.48
    " notice"           -0.48
    "rinfo"             -0.47
    "Dynamic"           -0.46
    Positive Logits
    " literal"          0.70
    " EconPapers"       0.67
    "ThroughAttribute"  0.65
    " literalmente"     0.65
    "addCriterion"      0.64
    "aarrggbb"          0.63
    "Personendaten"     0.61
    " literally"        0.61
    "Rüyada"            0.61
    "featureID"         0.60
    Activation Density: 0.428%

    No Known Activations