INDEX
    Explanations

    references to fathers and father figures

    Negative Logits
    Token         Logit
    " itſelf"     -1.12
    " iſt"        -1.09
    " ſche"       -1.00
    " myſelf"     -0.97
    " Majefty"    -0.91
    " Diſ"        -0.91
    "ſelves"      -0.91
    " beſt"       -0.90
    "ſelf"        -0.88
    " houſe"      -0.88
    Positive Logits
    Token         Logit
    " father"     0.93
    " Father"     0.92
    "Father"      0.83
    "father"      0.83
    " père"       0.74
    " FATHER"     0.69
    " fathers"    0.65
    "fathers"     0.63
    " Mr"         0.57
    "::::::::"    0.55
    Activation Density: 0.078%

    No Known Activations