    Explanations

    expressing preferences or interests

    Negative Logits
    "s"         0.41
    "m"         0.32
                0.31
                0.30
    "r"         0.29
                0.29
    "ы"         0.29
    "തിരെ"      0.28
    " auront"   0.28
                0.27
    Positive Logits
    " trabajar"        0.36
    " wholeheartedly"  0.36
    " storytelling"    0.34
    " dearly"          0.34
    " moderne"         0.32
    " работать"        0.31
    " to"              0.31
    "uste"             0.30
    " buhay"           0.30
    " furnishing"      0.30
    Act Density 0.043%

    No Known Activations