    Explanations

    references to father figures and familial relationships

    Negative Logits
    "licet"      -0.74
    "uilla"      -0.69
    "__))"       -0.65
                 -0.65
    "houette"    -0.64
    " Pickett"   -0.63
    " öss"       -0.61
    " delu"      -0.61
    "vrons"      -0.60
    "geries"     -0.60
    Positive Logits
    " fathers"   1.56
    " Fathers"   1.55
    " father"    1.50
    " FATHER"    1.47
    " Father"    1.41
    "Father"     1.27
    "father"     1.23
    " père"      1.16
    " Fath"      1.14
    " Père"      1.13
    Activation Density: 0.091%
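
As a rough illustration of what a density figure like 0.091% could mean, the sketch below assumes that activation density is the fraction of sampled token positions on which the feature fires (activation above zero). That definition, the `activation_density` helper, and the sample values are assumptions made for illustration; none of them are data from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Hypothetical helper: assumes "density" means the share of sampled
    positions on which the feature is active.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 100,000 token positions, 91 of them active.
sample = [0.0] * 99_909 + [1.3] * 91
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.091%"
```

Under this reading, a density of 0.091% corresponds to the feature being active on roughly 9 out of every 10,000 token positions.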

    No Known Activations