    Explanations

    phrases expressing negative comparisons

    Negative Logits
                      -0.80
    "iibo"            -0.66
    " transfieras"    -0.64
    "phrag"           -0.63
    "featureID"       -0.63
    "transQ"          -0.59
    "modelBuilder"    -0.58
    " iPads"          -0.58
    "PageModule"      -0.58
    " Blades"         -0.58
    Positive Logits
    "worst"      1.05
    "worse"      1.02
    "Worse"      1.00
    "Worst"      1.00
    " worse"     0.98
    " worst"     0.97
    " Worse"     0.94
    " Worst"     0.91
    " peores"    0.83
    " pior"      0.77
    Activation Density: 0.010%

    No Known Activations