    Explanations

    instances of satire in various contexts, particularly related to cultural commentary

    Negative Logits
    quier     -0.19
    edly      -0.17
    _PATCH    -0.16
    acl       -0.15
    ftar      -0.15
    slt       -0.15
    že        -0.15
    andles    -0.15
    adden     -0.15
    laces     -0.15
    Positive Logits
    uration   0.30
    suma      0.30
    irical    0.30
    anic      0.29
    ellite    0.28
    ellites   0.25
    elite     0.24
    iation    0.24
    URATION   0.22
    ires      0.22
    Act Density 0.008%

    No Known Activations