INDEX
    Explanations

    phrases indicating assistance or helpfulness

    Negative Logits
    Token              Logit
    "setPixel"         -0.49
    " admitted"        -0.48
    " hates"           -0.47
    " interviewed"     -0.46
    " orec"            -0.45
    "Eternal"          -0.45
    " rushed"          -0.45
    " erythrocytes"    -0.44
    " asked"           -0.44
    " preguntar"       -0.44
    Positive Logits
    Token              Logit
    " helps"           0.86
    " kaarangay"       0.81
    " Helps"           0.73
    "helps"            0.72
    " nhằm"            0.69
    " ensures"         0.69
    "Helps"            0.68
    " affords"         0.67
    " ivelany"         0.67
    " باعث"            0.67
    Activation Density: 0.624%

    No Known Activations