    Explanations

    text related to criticizing or mocking others

    Negative Logits
    " Warehouse"      -0.72
    " rebuilt"        -0.71
    " Located"        -0.67
    "romeda"          -0.64
    " pioneering"     -0.64
    "chnology"        -0.61
    "bitious"         -0.61
    "phalt"           -0.61
    "ufact"           -0.60
    "erenn"           -0.60

    Positive Logits
    " sarcastic"      1.02
    " slurs"          1.00
    " ridicule"       0.98
    " misinterpret"   0.96
    " misunderstand"  0.95
    " insults"        0.92
    " insulting"      0.92
    " replies"        0.91
    " jokes"          0.90
    " condesc"        0.89

    Activation Density: 0.862%

    No Known Activations