    Explanations

    words conveying extreme levels of disbelief or ridicule

    descriptors of absurdity and ridiculousness

    Negative Logits
    yer         -0.88
    Reviewed    -0.82
    enfranch    -0.78
    rien        -0.74
    avers       -0.74
    ribution    -0.71
    maid        -0.71
    builders    -0.71
    rounder     -0.70
    oyal        -0.70
    Positive Logits
    ness        0.89
     amounts    0.89
    nesses      0.89
     nonsense   0.85
     absurdity  0.85
    ly          0.84
    LY          0.83
     lengths    0.82
     amount     0.79
    NESS        0.79
    Act Density 0.042%

    No Known Activations