INDEX
    Explanations

    references to familial relationships, particularly focusing on sons and daughters

    Negative Logits
    ibel       -0.20
    incinn     -0.17
    TURE       -0.17
    istring    -0.16
    abies      -0.16
    ture       -0.15
    yor        -0.15
    emales     -0.15
     ancestor  -0.15
    tl         -0.14
    Positive Logits
    -in        0.31
    orous      0.28
    hood       0.25
    eren       0.23
    eral       0.21
    -IN        0.20
    nets       0.19
    HO         0.17
    less       0.17
    ntag       0.17
    Activation Density: 0.043%

    No Known Activations