    NEGATIVE LOGITS
    Token               Logit
    "เว"                -0.07
    " contradictory"    -0.07
    "ービ"              -0.07
    "_points"           -0.07
    "okino"             -0.07
    " nét"              -0.06
                        -0.06
    " sph"              -0.06
    "ubyte"             -0.06
    " Ratio"            -0.06
    POSITIVE LOGITS
    Token               Logit
    " sick"             0.09
    " ill"              0.08
    " Sick"             0.08
    "cks"               0.07
    " illness"          0.07
    "(en"               0.07
    " Ill"              0.07
    " illnesses"        0.07
    " willingness"      0.07
    " smoker"           0.06
    Activation density: 0.012%

    No Known Activations