INDEX
    Explanations

    negative sentiments or refusals

    Negative Logits
    Token           Logit
    soigne          -0.96
    habile          -0.94
    Chapitre        -0.94
    épu             -0.90
    Intere          -0.85
    Simult          -0.85
    Confu           -0.85
    hcm             -0.84
    triomphe        -0.84
    désol           -0.83
    Positive Logits
    Token             Logit
    autorytatywna     0.82
    <bos>             0.69
    wont              0.66
    be                0.64
    Cyfeiriadau       0.63
    necessarily       0.62
    intptr            0.62
    won               0.62
    disambiguazione   0.60
    Personensuche     0.59
    Activation Density: 0.159%

    No Known Activations