INDEX
    Explanations

    references to physical touch and emotions

    Negative Logits
    hips       -0.16
    iyat       -0.16
    aires      -0.15
    monds      -0.14
    liga       -0.14
    agar       -0.14
    /people    -0.14
    ега        -0.14
    تف         -0.14
    abajo      -0.14
    Positive Logits
    ings       0.17
    aroo       0.17
     followed  0.17
     session   0.16
    ero        0.16
    ingly      0.16
    /update    0.15
    down       0.15
    tings      0.15
    ibr        0.15
    Act Density 0.174%

    No Known Activations