INDEX
    Explanations

    references to the pronoun "she"

    Negative Logits
    "kefeller"        -0.84
    "antage"          -0.75
    "emetery"         -0.72
    "odder"           -0.71
    "vernment"        -0.69
    " Observatory"    -0.68
    "PDATE"           -0.66
    " Skydragon"      -0.65
    "undo"            -0.65
    "atory"           -0.65

    Positive Logits
    " herself"        1.50
    "pher"            1.43
    "athed"           1.27
    "athing"          1.23
    "pard"            1.21
    "pherd"           1.11
    "ffield"          1.10
    "ikh"             0.99
    "ppard"           0.98
    "lled"            0.98

    Activation Density: 0.114%

    No Known Activations