    Explanations

    derogatory or dismissive language related to opinions or reviews

    Negative Logits
    Token               Logit
    " nakalista"        -0.53
    "gnore"             -0.53
    "rån"               -0.52
    "findpost"          -0.51
    " kasarigan"        -0.50
    "::~"               -0.48
    "Responder"         -0.48
    "TagMode"           -0.48
    " nahilalakip"      -0.47
    "Autowired"         -0.47
    Positive Logits
    Token               Logit
    " nonsense"         1.14
    "onsense"           0.95
    "nonsense"          0.95
    " bullshit"         0.91
    " shenanigans"      0.76
    " crap"             0.76
    "Nonsense"          0.72
    " foolishness"      0.71
    " fuss"             0.68
    " stuff"            0.67
    Activation Density: 0.225%
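
    The density figure above can be read as the share of token positions on which the feature fires. The sketch below is only an illustration under assumptions: the function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard or its tooling.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as "active" when its value is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 9 active positions out of 4000 tokens.
sample = [0.0] * 3991 + [1.3] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

    At a density of 0.225%, a feature would fire on roughly 2 to 3 tokens out of every 1000.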

    No Known Activations