INDEX
    Explanations

    phrases or sentences emphasizing qualities, conditions, or comparisons

    negating phrases that emphasize limits or exceptions

    Negative Logits
    "ibaba"          -0.89
    "etsk"           -0.67
    " separat"       -0.63
    " Pengu"         -0.63
    "ovsky"          -0.60
    "ashtra"         -0.59
    "abi"            -0.57
    " havoc"         -0.57
    "ario"           -0.57
    " Parenthood"    -0.56
    Positive Logits
    " means"         0.81
    " virtue"        0.79
    " leaps"         0.74
    "uu"             0.72
    " Means"         0.71
    "umbers"         0.69
    "dB"             0.67
    " margins"       0.67
    "proxy"          0.65
    "products"       0.65
    Act Density 0.228%

    No Known Activations