INDEX
    Explanations

    references to paternal figures or authority figures

    references to "fathers" and parental figures in various contexts

    Negative Logits
    Token        Logit
    FW           -0.83
    é¾           -0.74
     Flavoring   -0.71
    IFF          -0.71
    CHAT         -0.71
    ELL          -0.68
    eq           -0.68
    mble         -0.67
    ORT          -0.64
    AH           -0.63
    Positive Logits
    Token        Logit
    father        1.22
     Fathers      1.05
    sonian        0.89
    parents       0.88
     fathers      0.87
    ures          0.86
    hood          0.81
    stein         0.79
    sson          0.79
    essor         0.77
    Act Density 0.005%
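
    For orientation, an activation density figure like the one above is commonly read as the fraction of sampled tokens on which the feature fires. The sketch below assumes that definition. The function name `activation_density`, the zero threshold, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation; the dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 5 active tokens out of 100,000.
acts = [0.0] * 99_995 + [1.3, 0.7, 2.1, 0.4, 0.9]
density = activation_density(acts)
print(f"Act Density {density:.3%}")
```

    Under this reading, a density of 0.005% corresponds to roughly 5 active tokens per 100,000 sampled.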

    No Known Activations