    Explanations

    negations and phrases indicating exclusivity or rarity

    Negative Logits
    Token          Logit
    "lamaz"        -0.15
    "euillez"      -0.14
    "ounder"       -0.14
    "eyin"         -0.14
    "ght"          -0.14
    "dera"         -0.14
    "esser"        -0.13
    "ứt"           -0.13
    " offsetof"    -0.13
    "opia"         -0.13
    Positive Logits
    Token          Logit
    " did"         0.71
    " does"        0.66
    " do"          0.60
    "did"          0.57
    " Did"         0.53
    "Did"          0.52
    "does"         0.51
    " Does"        0.50
    ".did"         0.42
    "Does"         0.41
    Activation Density: 0.216%

    No Known Activations