INDEX
    Explanations

    references to health issues and popular health treatments in society

    Negative Logits
    Token        Logit
    "ial"        -0.19
    "ander"      -0.17
    "eder"       -0.14
    " Rudd"      -0.14
    "ann"        -0.14
    "_traits"    -0.14
    "inton"      -0.13
    "ushman"     -0.13
    " RL"        -0.13
    "annes"      -0.13
    Positive Logits
    Token        Logit
    "æĿŁ"        0.17
    "íĭ±"        0.16
    "uler"       0.15
    "ç´ł"        0.14
    " masses"    0.14
    "sı"         0.14
    "977"        0.14
    "jÃŃt"       0.14
    " citiz"     0.14
    "nech"       0.13
    Activation Density: 0.214%

    No Known Activations