    Explanations

    words related to various scientific and medical terms

    Negative Logits
    m           -0.82
    ness        -0.74
    s           -0.71
    baix        -0.68
    ling        -0.67
     silêncio   -0.66
    us          -0.65
    r           -0.64
    ms          -0.64
    masing      -0.63
    Positive Logits
    auso        1.03
    eo          0.98
     Quo        0.97
    o           0.95
     Ando       0.95
    оо          0.94
     Malo       0.91
     Dodo       0.91
     Puro       0.91
    eco         0.90
    Activation Density: 1.174%

    No Known Activations