    Explanations

    References to a specific individual or entity, especially possessive pronouns and direct mentions of that individual.

    Negative Logits
    " Houſe"       -0.70
    " auffi"       -0.70
    " raiſ"        -0.69
    " itſelf"      -0.69
    " uſe"         -0.67
    " cauſe"       -0.67
    " myſelf"      -0.67
    " pleaſure"    -0.66
    " Theodo"      -0.64
    " Aphrodite"   -0.64

    Positive Logits
    " his"          3.85
    "his"           3.09
    "His"           2.77
    " His"          2.70
    " HIS"          2.44
    " himself"      2.40
    " him"          2.37
    "彼の"          2.33
    "他的"          2.24
    " he"           2.23

    Act Density 0.235%

    No Known Activations