INDEX
    Explanations

    numeric comparisons, specifically cases where one value is several times larger than another

    Negative Logits
    Token            Logit
    "isson"          -0.82
    "behind"         -0.80
    "perse"          -0.76
    "artifacts"      -0.74
    "fy"             -0.72
    "bay"            -0.72
    "among"          -0.70
    "VPN"            -0.69
    "galitarian"     -0.69
    " alike"         -0.68
    Positive Logits
    Token            Logit
    " usual"         1.26
    " amount"        1.20
    " allowable"     1.14
    " maximum"       1.14
    " average"       1.10
    " median"        1.07
    " original"      1.06
    " rate"          1.04
    " sum"           1.04
    " lowest"        1.02
    Activation Density: 0.119%

    No Known Activations
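
The density figure above can be read as the share of token positions on which the feature fires. The sketch below assumes that definition (fraction of positions with activation above a threshold of zero). The function name `activation_density` and the sample data are hypothetical and are not taken from the dashboard or any library.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature is active; the dashboard may use a different definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 12 of which are active.
sample = [0.0] * 9988 + [0.5] * 12
density = activation_density(sample)
print(f"{density:.3%}")
```

A density near 0.1% would mean the feature is active on roughly one token position in a thousand, under the assumed definition.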