INDEX
    Explanations

    negations and expressions of doubt or denial

    Negative Logits
    Token            Logit
    " not"           -0.19
    "asn"            -0.15
    " não"           -0.15
    " ikke"          -0.15
    " somewhat"      -0.15
    " no"            -0.15
    " nicht"         -0.14
    "772"            -0.14
    "bruar"          -0.14
    " never"         -0.14

    Positive Logits
    Token            Logit
    "oriously"       0.25
    "ori"            0.25
    " anymore"       0.24
    " necessarily"   0.23
    "ched"           0.23
    "epad"           0.22
    "ches"           0.21
    " yet"           0.19
    "tingham"        0.19
    "ional"          0.18
    Activation Density: 0.289%

    No Known Activations