INDEX
    Explanations

    references to average and statistical measures

    NEGATIVE LOGITS
    Token          Logit
    "ud"           -0.17
    "ad"           -0.17
    "enames"       -0.16
    "ãĥ¼ãĥĭ"       -0.16
    " averages"    -0.15
    "oper"         -0.15
    "_avg"         -0.15
    "ams"          -0.15
    "elig"         -0.15
    "ids"          -0.15
    POSITIVE LOGITS
    Token          Logit
    " joe"         0.28
    " Joe"         0.22
    "Joe"          0.21
    "/std"         0.19
    "-case"        0.19
    "-sized"       0.19
    "-priced"      0.18
    "abwe"         0.17
    "bilt"         0.16
    " sac"         0.16
    Act Density 0.024%
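
To put the density figure in concrete terms, here is a minimal sketch. It assumes that "Act Density" means the percentage of sampled tokens on which the feature's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: density = (active tokens / total tokens) * 100.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample (not real data): 24 active tokens out of 100,000.
sample = [1.5] * 24 + [0.0] * (100_000 - 24)
density = activation_density(sample)
print(f"Act Density {density:.3f}%")  # prints "Act Density 0.024%"
```

Under that assumption, a density of 0.024% corresponds to roughly 24 active tokens per 100,000 sampled.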

    No Known Activations