INDEX
    Explanations

    references to family relationships and personal connections

    Sentences about a woman (often "she")

    Negative Logits
    " himself"     -1.29
    "himself"      -1.06
    " koji"        -0.99
    " Himself"     -0.95
    " који"        -0.91
    "AndEndTag"    -0.84
    " seine"       -0.82
    " his"         -0.81
    " boyhood"     -0.80
    "وفاته"        -0.77
    Positive Logits
    " herself"     2.12
    "herself"      1.53
    " her"         1.26
    " she"         1.19
    " ihrem"       1.09
    "حياتها"       0.99
    "但她"          0.98
    " shes"        0.97
    " ihren"       0.94
    " 그녀"         0.93
    Activation Density: 1.596%

    No Known Activations
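
As a rough illustration of how the activation density figure can be read, the sketch below assumes it is the fraction of sampled token positions on which the feature has a nonzero activation. That definition, the function name `activation_density`, and the sample values are assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition: a position counts as "active" when the feature's
    activation is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1,000 token positions, 16 of them active.
sample = [0.0] * 984 + [1.5] * 16
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 1.596% would mean the feature fires on roughly 16 of every 1,000 sampled tokens.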