    Explanations

    references to familial relationships, especially involving fathers

    Negative Logits
    Token         Logit
    "licet"       -0.80
    " الحره"      -0.71
    " Jovi"       -0.71
    " trover"     -0.70
    " öss"        -0.68
    (missing)     -0.67
    "houette"     -0.66
    " delu"       -0.66
    " coils"      -0.65
    "ulets"       -0.64
    Positive Logits
    Token         Logit
    " father"     1.78
    " Father"     1.66
    " fathers"    1.66
    " FATHER"     1.66
    " Fathers"    1.61
    "Father"      1.50
    "father"      1.46
    " père"       1.32
    " Vater"      1.22
    " padre"      1.21
    Activation Density: 0.047%

    No Known Activations