    Explanations

    expressions of familial relationships and affection

    Negative Logits
    Token          Logit
    " Actress"     -0.18
    " Woman"       -0.17
    " mistress"    -0.17
    " lady"        -0.17
    "osit"         -0.17
    "woman"        -0.17
    " woman"       -0.16
    "Lady"         -0.16
    " Lady"        -0.15
    " pregnant"    -0.15
    Positive Logits
    Token          Logit
    " dad"         0.65
    " Dad"         0.63
    " father"      0.59
    " dads"        0.59
    " Father"      0.57
    " fathers"     0.56
    " daddy"       0.54
    "dad"          0.52
    " Fathers"     0.52
    "Father"       0.51
    Activation Density: 0.066%

    No Known Activations