INDEX
    Explanations

    references to medical or health-related terminology

    Negative Logits
    i     -0.65
    er    -0.61
    o     -0.52
    ی     -0.51
    ed    -0.51
    a     -0.50
    an    -0.46
    es    -0.46
    y     -0.44
    и     -0.43
    Positive Logits
    tempts        0.26
    imately       0.23
    dependent     0.22
    stantiate     0.20
    OwnProperty   0.18
    imizer        0.17
    دواج          0.16
    onical        0.15
    liv           0.15
    يات           0.15
    Activation Density 1.299%

    No Known Activations