INDEX
    Explanations

    mentions or descriptions of daughters

    Negative Logits
    "kefeller"     -0.87
    " constitu"    -0.81
    "ilitarian"    -0.72
    "uchin"        -0.70
    " dstg"        -0.69
    "hent"         -0.69
    " Tribunal"    -0.67
    "psey"         -0.67
    "etting"       -0.66
    "vernment"     -0.66
    Positive Logits
    " Ivanka"       0.96
    "hood"          0.82
    " Louise"       0.81
    " daughter"     0.80
    "Anne"          0.79
    " Isabel"       0.77
    " daughters"    0.77
    "girl"          0.73
    " Hannah"       0.73
    " Daughter"     0.71
    Activation Density: 0.016%

    No Known Activations