    Explanations

    proper nouns related to familial relationships (e.g., Father, John)

    references to father figures and paternal relationships

    Negative Logits
    mble                -0.79
    ellen               -0.73
     externalToEVAOnly  -0.68
    EVA                 -0.67
    pace                -0.65
     hijab              -0.63
    Women               -0.63
    burgh               -0.62
    lled                -0.62
    ibr                 -0.61
    Positive Logits
    hood        1.15
    volent      0.99
     patriarch  0.98
    father      0.89
     figure     0.84
    hetical     0.82
    hesis       0.82
    liest       0.77
    land        0.76
    less        0.72
    Activation Density: 0.063%

    No Known Activations