INDEX
    Explanations

    negative phrasing and expressions of doubt or uncertainty

    Negative Logits
    Token            Logit
    "nt"             -0.17
    "zelf"           -0.14
    " INCIDENTAL"    -0.14
    "kontakte"       -0.13
    "ittal"          -0.13
    "CTL"            -0.13
    "ule"            -0.13
    "-caret"         -0.13
    "arc"            -0.13
    "ÄĽk"            -0.13
    Positive Logits
    Token            Logit
    "erspective"     0.15
    "ubat"           0.14
    "idente"         0.14
    "acket"          0.13
    "ispens"         0.13
    "/off"           0.13
    "sob"            0.13
    "çe"             0.13
    "epad"           0.13
    "omentum"        0.13
    Activation Density: 0.009%

    No Known Activations