INDEX
    Explanations

    references to interpersonal connections and relationships

    Negative Logits
    Token       Logit
    "adol"      -0.17
    "sis"       -0.16
    "iв"        -0.16
    "olini"     -0.16
    "алов"      -0.15
    "alet"      -0.15
    "/apt"      -0.15
    "üst"       -0.15
    "èª"        -0.14
    "िट"        -0.14
    Positive Logits
    Token       Logit
    " another"  0.46
    "another"   0.40
    "-an"       0.40
    " ano"      0.32
    " Another"  0.28
    "Another"   0.28
    "_an"       0.28
    " otro"     0.27
    "ano"       0.26
    "دیگر"      0.25
    Activation Density: 0.006%

    No Known Activations