INDEX
    Explanations

    words and phrases expressing negativity or disagreement

    negation and comparison

    Negative Logits
    rungsseite          -0.81
     Numerade           -0.79
     مشين               -0.66
     NSCoder            -0.64
    mặt                 -0.63
    tagHelperRunner     -0.62
    IndentedString      -0.59
     препратки          -0.57
     محفوظة             -0.57
     consultato         -0.57
    Positive Logits
     *_                 0.48
    '));\n              0.45
     trustworthy        0.45
     reliable           0.43
    psons               0.41
     España             0.41
    :])                 0.41
    colgroup            0.41
    ')));               0.40
     Springsteen        0.40
    Activation Density: 1.581%

    No Known Activations