INDEX
    Explanations

    specific expressions of negation or contradiction

    Negative Logits
     '{@        -0.40
     Wan        -0.38
     Basili     -0.36
     Fish       -0.35
     Book       -0.34
     resist     -0.34
     Yes        -0.33
    ess         -0.33
    anti        -0.32
    cy          -0.32
    Positive Logits
    httphttps         0.83
    ftagPool          0.77
    sizeCache         0.76
     surla            0.72
    complexContent    0.66
    rungsseite        0.61
    AddTagHelper      0.60
     miniaturka       0.59
     zijne            0.59
    Personendaten     0.59
    Activation Density: 0.065%

    No Known Activations