INDEX
    Explanations

    mentions of family members, particularly sisters, in various contexts

    mentions of the word "sister."

    Negative Logits
    ulkan         -0.71
    ateurs        -0.64
    beit          -0.64
    imal          -0.63
    vt            -0.62
    urable        -0.61
    holes         -0.61
    animate       -0.60
    ambo          -0.60
    hematically   -0.60
    Positive Logits
     sister       3.63
     sisters      2.53
     brother      2.39
     sibling      2.30
     niece        2.03
     cousin       2.02
     Sister       1.96
     daughter     1.95
     siblings     1.85
     aunt         1.71
    Activation Density: 0.007%

    No Known Activations