    Explanations

    terms related to emotional and physical health

    Negative Logits
    berger    -0.17
    este      -0.17
    sky       -0.15
    erva      -0.15
    abr       -0.15
    ny        -0.14
    âĢİ       -0.14
    wa        -0.14
    lick      -0.14
    ÑĥÑģк      -0.14
    Positive Logits
    ément     0.18
     Roch     0.14
    Ú¯ÛĮرÛĮ    0.14
    ãĥĥãĥĹ    0.14
    inan      0.14
     nÃło     0.14
    _here     0.13
    _PULL     0.13
     grátis   0.13
    оÑģков    0.13
    Activation Density: 0.468%

    No Known Activations
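
Several tokens in the logit lists (for example `ÑĥÑģк`, `ãĥĥãĥĹ`, `Ú¯ÛĮرÛĮ`) look like the printable stand-ins that byte-level BPE vocabularies use for raw UTF-8 bytes. The page does not name the tokenizer, so the sketch below is only an illustration under the assumption that it follows the GPT-2 byte-to-character table. The helper names `bytes_to_unicode` and `decode_token` are hypothetical, not part of any dashboard API. Characters that are not in the table are treated as already decoded, and a token that does not form valid UTF-8 after mapping is returned unchanged.

```python
def bytes_to_unicode():
    """Byte -> printable character table in the style of GPT-2 byte-level BPE."""
    # Bytes that are already printable keep their own code point.
    bs = (list(range(0x21, 0x7F))      # '!' .. '~'
          + list(range(0xA1, 0xAD))    # U+00A1 .. U+00AC
          + list(range(0xAE, 0x100)))  # U+00AE .. U+00FF
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point at 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: stand-in character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Best-effort decoding of a byte-level token string to readable text."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Assumption: characters outside the table (such as a leading
            # space or a Cyrillic letter) were already decoded upstream.
            raw.extend(ch.encode("utf-8"))
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 after mapping: assume the token was already readable.
        return token


for tok in ["ÑĥÑģк", "ãĥĥãĥĹ", " nÃło", "ément"]:
    print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `ÑĥÑģк` decodes to `уск` and `ãĥĥãĥĹ` to `ップ`, while a token such as `ément` fails UTF-8 validation after mapping and is left as it is. Tokens that are only fragments of a multi-byte character would also fall through to the unchanged case.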