INDEX
    Explanations

    phrases that indicate contradiction or contrast in statements

    followed by a negation

    Negative Logits
    " úteis"              -0.46
    " ~"                  -0.45
    " disambiguazione"    -0.45
    "~"                   -0.45
    "MessageWindow"       -0.44
    " EconPapers"         -0.43
    " plati"              -0.43
    "del"                 -0.43
    "dup"                 -0.43
    "<bos>"               -0.42
    Positive Logits
    " never"              0.86
    " nunca"              0.83
    "never"               0.76
    " cannot"             0.76
    " Never"              0.76
    "出版年"               0.75
    "Personendaten"       0.75
    "ScopeManager"        0.75
    "nunca"               0.74
    "Never"               0.74
    Activation Density: 0.481%

    No Known Activations